Saliency Detection via Full Convolution Neural Network and Low Rank Sparse Decomposition
Abstract: To accurately detect salient regions in images with complex backgrounds, a saliency detection method that combines a full convolution neural network (FCNN) with low rank sparse decomposition is proposed. An image is decomposed into a low rank matrix representing the background and sparse noise corresponding to the salient region, and high-level semantic prior knowledge learned by the FCNN is incorporated to detect the salient region. First, the original image is clustered into superpixels, and a feature matrix is constructed from the color, texture and edge features extracted from each superpixel. Then, using the MSRA database, a feature transformation matrix is learned by gradient descent and high-level semantic prior knowledge is learned by the FCNN. Next, the feature matrix is transformed using the feature transformation matrix and the high-level semantic prior knowledge matrix. Finally, the transformed matrix is decomposed into a low rank matrix and a sparse matrix by robust principal component analysis, and the saliency map is computed from the sparse matrix. The proposed method is compared with currently popular methods on public datasets. The experimental results show that it accurately detects regions of interest and is an effective preprocessing step for object detection and segmentation in natural images.
Key words:
- Saliency detection
- full convolution neural network (FCNN)
- low rank sparse decomposition
- high-level semantic prior knowledge
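The final step described in the abstract, splitting the transformed feature matrix into a low rank part (background) and a sparse part (salient region), can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses the inexact augmented Lagrange multiplier solver for robust principal component analysis, takes the columns of the input to be superpixels, and scores each superpixel by the l1 norm of its sparse column. The function names `rpca_ialm` and `superpixel_saliency`, the default weight `lam`, and the solver constants are all hypothetical choices made for this sketch.

```python
import numpy as np


def rpca_ialm(M, lam=None, tol=1e-7, max_iter=500):
    """Split M into low rank L and sparse S with M ~= L + S (inexact ALM)."""
    M = np.asarray(M, dtype=float)
    m, n = M.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))  # assumed default weight on the l1 term
    norm_M = np.linalg.norm(M, "fro")
    spectral = np.linalg.norm(M, 2)  # largest singular value of M
    Y = M / max(spectral, np.abs(M).max() / lam)  # dual variable initialisation
    mu, rho = 1.25 / spectral, 1.5  # penalty weight and its growth factor
    mu_max = mu * 1e7
    S = np.zeros_like(M)
    for _ in range(max_iter):
        # Low rank update: soft-threshold the singular values.
        U, s, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
        L = (U * np.maximum(s - 1.0 / mu, 0.0)) @ Vt
        # Sparse update: soft-threshold the entries.
        T = M - L + Y / mu
        S = np.sign(T) * np.maximum(np.abs(T) - lam / mu, 0.0)
        # Dual ascent on the constraint M = L + S.
        R = M - L - S
        Y = Y + mu * R
        mu = min(mu * rho, mu_max)
        if np.linalg.norm(R, "fro") <= tol * norm_M:
            break
    return L, S


def superpixel_saliency(S):
    """Assumed scoring rule: l1 norm of each sparse column, scaled to [0, 1]."""
    scores = np.abs(S).sum(axis=0)
    peak = scores.max()
    return scores / peak if peak > 0 else scores
```

In a full pipeline the input `M` would be the feature matrix after applying the learned transformation and the FCNN prior; here any feature matrix with one column per superpixel can be passed in, and the per-superpixel scores would then be mapped back to pixels to form the saliency map.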
1) Recommended by Associate Editor ZUO Wang-Meng
Table 1 Comparison of MAE between the proposed method and traditional methods

| Algorithm | MSRA-test1000 | PASCAL-S |
|---|---|---|
| FT | 0.2480 | 0.3066 |
| SR | 0.2383 | 0.2906 |
| CA | 0.2462 | 0.2994 |
| SF | 0.1449 | 0.2534 |
| GR | 0.2524 | 0.2992 |
| MR | 0.1855 | 0.2283 |
| BSCA | 0.1859 | 0.2215 |
| LRMR | 0.2442 | 0.2759 |
| Proposed method | 0.0969 | 0.1814 |
Table 2 Comparison of average running time between the proposed method and other methods

| Algorithm | Time on MSRA-test1000 (s) | Time on PASCAL-S (s) | Code type |
|---|---|---|---|
| FT | 0.080 | 0.111 | MATLAB |
| SR | 0.024 | 0.030 | MATLAB |
| CA | 20.587 | 22.299 | MATLAB |
| SF | 0.138 | 0.217 | MATLAB |
| GR | 0.636 | 0.905 | MATLAB |
| MR | 0.559 | 0.759 | MATLAB |
| BSCA | 1.101 | 1.475 | MATLAB |
| LRMR | 7.288 | 9.674 | MATLAB |
| Proposed method | 6.916 | 9.154 | MATLAB |
Table 3 Comparison of MAE between the foreground object segmented by the FCNN and the binary ROI segmented by the proposed method

| Algorithm | MSRA-test1000 | PASCAL-S |
|---|---|---|
| FCNN high-level prior knowledge | 0.0531 | 0.1040 |
| Proposed method (binarized) | 0.0516 | 0.0964 |
Table 4 Comparison of evaluation metrics between the proposed method and deep learning methods

| Algorithm | F-measure | MAE |
|---|---|---|
| RFCN | 0.7468 | - |
| DS | 0.7710 | 0.1210 |
| Proposed method | 0.7755 | 0.1814 |
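The tables report two metrics, MAE and F-measure. As a rough guide to how such numbers are usually obtained, the sketch below computes both for a single saliency map against a binary ground-truth mask. It rests on assumptions the source does not state: the saliency map is scaled to [0, 1], the map is binarized at a caller-supplied threshold before precision and recall are computed, and the weight `beta2 = 0.3` is the value commonly used in saliency benchmarks. The function names are hypothetical.

```python
import numpy as np


def mean_absolute_error(saliency, ground_truth):
    """Mean absolute difference between a [0, 1] saliency map and a binary mask."""
    sal = np.asarray(saliency, dtype=float)
    gt = np.asarray(ground_truth, dtype=float)
    return float(np.mean(np.abs(sal - gt)))


def f_measure(saliency, ground_truth, threshold, beta2=0.3):
    """Weighted harmonic mean of precision and recall after thresholding.

    beta2 = 0.3 is an assumed default, not a value taken from the source.
    """
    pred = np.asarray(saliency, dtype=float) >= threshold
    gt = np.asarray(ground_truth).astype(bool)
    tp = np.logical_and(pred, gt).sum()  # correctly detected salient pixels
    if tp == 0:
        return 0.0  # no overlap: precision and recall are both zero
    precision = tp / pred.sum()
    recall = tp / gt.sum()
    return float((1 + beta2) * precision * recall / (beta2 * precision + recall))
```

Dataset-level figures such as those in the tables would then be averages of these per-image values over all test images; the exact thresholding rule used by the authors is not given here, so results from this sketch are not directly comparable with the reported numbers.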