Abstract: With the rapid development of computers and social networks, demand for automatic image aesthetic evaluation has grown, and the task has attracted increasing attention. Because aesthetic evaluation is subjective and complex, traditional handcrafted features and generic image descriptors struggle to represent the overall aesthetic character of an image, or to quantify and model it accurately. This paper proposes an image aesthetic classification method based on parallel deep convolutional neural networks. Starting from different views of the same image, the parallel networks learn features automatically and yield a more comprehensive description of image aesthetics. A support vector machine (SVM) classifier is then trained on these features to perform aesthetic classification. Experiments on two widely used image aesthetics databases show that the proposed method achieves higher classification accuracy than existing methods.
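The pipeline described in the abstract (one feature vector per parallel network, fused into a single descriptor, then classified by an SVM) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the per-view vectors are toy values standing in for CNN outputs, the fusion is assumed to be plain concatenation, and the linear SVM is trained with a simple hinge-loss subgradient loop. The names `fuse_features`, `train_linear_svm`, and `predict` are hypothetical.

```python
import random


def fuse_features(view_features):
    """Concatenate per-view feature vectors into one descriptor (assumed fusion rule)."""
    fused = []
    for vec in view_features:
        fused.extend(vec)
    return fused


def train_linear_svm(samples, labels, epochs=50, lr=0.1, reg=0.01, seed=0):
    """Subgradient descent on the L2-regularized hinge loss.

    labels must be +1 (high aesthetic quality) or -1 (low aesthetic quality).
    """
    rng = random.Random(seed)
    w = [0.0] * len(samples[0])
    b = 0.0
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = samples[i], labels[i]
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            # Regularization step: shrink the weights slightly.
            w = [wj * (1.0 - lr * reg) for wj in w]
            # Hinge-loss step: only samples inside the margin move the boundary.
            if margin < 1.0:
                w = [wj + lr * y * xj for wj, xj in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    """Return +1 or -1 according to the side of the separating hyperplane."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Toy stand-ins for the outputs of two parallel networks (two views per image).
high_quality = [
    [[1.0, 0.8], [0.9, 1.1]],
    [[0.7, 1.2], [1.0, 0.6]],
    [[1.1, 0.9], [0.8, 1.0]],
]
low_quality = [
    [[-1.0, -0.7], [-0.8, -1.2]],
    [[-0.6, -1.1], [-1.0, -0.9]],
    [[-0.9, -1.0], [-1.1, -0.7]],
]

train_x = [fuse_features(v) for v in high_quality + low_quality]
train_y = [1] * len(high_quality) + [-1] * len(low_quality)
w, b = train_linear_svm(train_x, train_y)
predictions = [predict(w, b, x) for x in train_x]
```

In a real system each toy vector would be replaced by the activations of one trained network column, and an established SVM library would normally replace the hand-written training loop.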
Table 1 Classification accuracy of single-column convolutional neural networks with different structures
Fully connected layer options: Fc 4096, Fc 2048, Fc 1024, Fc 512, Fc 256, Fc 2
Architecture   Layers selected   Accuracy (%)
Arch1          √ √ √             83.70
Arch2          √ √ √ √           83.73
Arch3          √ √ √ √ √         83.21
Arch4          √ √ √ √ √ √       83.28
Table 2 Classification accuracy of single-column convolutional neural networks with different inputs
Input type    Accuracy (%)
Normal        83.28
Resize        80.28
H             70.03
S             75.90
V             82.99
Daubechies    81.60

Table 3 Classification accuracy of various feature combinations
Input options: Normal, Resize, H, S, V, Daubechies
Combination   Inputs selected   Feature dimension   Accuracy (%)
1             √ √ √             768                 83.93
2             √                 256                 83.28
3             √ √               512                 83.66
4             √ √               512                 84.18
5             √ √               512                 85.00
6             √ √ √             768                 85.17
7             √ √ √             768                 85.33
8             √ √ √             768                 85.83
9             √ √ √ √           1024                85.41
10            √ √ √ √ √         1280                85.94
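The H, S, and V inputs listed in Table 2 and the feature dimensions in Table 3 can be illustrated with a short sketch. It assumes the H, S, and V inputs are the three planes of a standard RGB-to-HSV conversion, and that each network column contributes 256 features that are concatenated, which is consistent with the 256, 512, 768, 1024, and 1280 dimensions in Table 3. The function names `split_hsv` and `fused_dimension` are hypothetical, and the Daubechies wavelet input is not covered here.

```python
import colorsys

# Per Table 3, a single input yields a 256-dimensional feature vector.
FEATURES_PER_COLUMN = 256


def split_hsv(rgb_image):
    """Split an RGB image into separate H, S, and V planes.

    rgb_image is a list of rows, each row a list of (r, g, b) tuples in 0-255.
    The returned planes hold floats in [0, 1].
    """
    h_plane, s_plane, v_plane = [], [], []
    for row in rgb_image:
        hsv_row = [
            colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            for r, g, b in row
        ]
        h_plane.append([pixel[0] for pixel in hsv_row])
        s_plane.append([pixel[1] for pixel in hsv_row])
        v_plane.append([pixel[2] for pixel in hsv_row])
    return h_plane, s_plane, v_plane


def fused_dimension(num_columns):
    """Feature dimension after concatenating the outputs of num_columns networks."""
    return num_columns * FEATURES_PER_COLUMN


# A 2x2 test image: red, green / blue, white.
image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]
h, s, v = split_hsv(image)
```

Each plane would then be fed to its own network column. For example, `fused_dimension(5)` gives 1280, the dimension of combination 10 in Table 3.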
Table 4 Experimental results on the AVA1 dataset and comparison with existing methods

Table 5 Experimental results on the AVA2 dataset and comparison with existing methods
Method                 Accuracy (%)
RDCNN semantic [10]    75.42
Proposed method        77.03
Table 6 Experimental results on each scene category and on the whole CUHKPQ dataset, and comparison with existing methods
Feature type    Method                                         Animal   Architecture   Human    Landscape   Night    Plant    Static   Overall
Handcrafted     All features in [8]*                           0.7751   0.8526         0.7908   0.8170      0.7321   0.8093   0.7829   0.7944
Handcrafted     All features in [9]                            0.8937   0.9275         0.9740   0.9468      0.8463   0.9182   0.9069   0.9209
Local           Semantic features [12]                         0.8623   0.8644         0.9313   0.8416      0.8742   0.8685   0.8964   0.8787
Local           Semantic features + handcrafted features [12]  0.9033   0.8755         0.9472   0.8853      0.9052   0.9232   0.9094   0.9093
Deep learning   DCNN Aesth SP [16]                             -        -              -        -           -        -        -        0.9193
Deep learning   Proposed method                                0.9382   0.9113         0.9697   0.9100      0.9166   0.9410   0.9159   0.9395
* The data in this row are quoted from the results in [12].
[1] Wang Wei-Ning, Yi Jing-Jian, He Qian-Hua. Review for computational image aesthetics. Journal of Image and Graphics, 2012, 17(8): 893-901 (in Chinese)
[2] Wang Wei-Ning, Liu Jian-Cong, Xu Xiang-Min, Jiang Yi-Zi, Wang Li. Aesthetic enhancement of images based on photography composition guidelines. Journal of South China University of Technology (Natural Science Edition), 2015, 43(5): 51-58 (in Chinese)
[3] Wang Wei-Ning, Yi Jing-Jian, Xu Xiang-Min, Wang Li. Computational aesthetics of image classification and evaluation. Journal of Computer-Aided Design & Computer Graphics, 2014, 26(7): 1075-1083 (in Chinese)
[4] Wang W N, Cai D, Wang L, Huang Q H, Xu X M, Li X L. Synthesized computational aesthetic evaluation of photos. Neurocomputing, 2016, 172: 244-252
[5] Tong H H, Li M J, Zhang H J, He J R, Zhang C S. Classification of digital photos taken by photographers or home users. In: Proceedings of the 5th Pacific Rim Conference on Multimedia. Tokyo, Japan: Springer, 2004. 198-205
[6] Datta R, Joshi D, Li J, Wang J Z. Studying aesthetics in photographic images using a computational approach. In: Proceedings of the 9th European Conference on Computer Vision. Graz, Austria: Springer, 2006. 288-301
[7] Wang W N, Zhao W J, Cai C J, Huang J X, Xu X M, Li L. An efficient image aesthetic analysis system using Hadoop. Signal Processing: Image Communication, 2015, 39: 499-508
[8] Ke Y, Tang X O, Jing F. The design of high-level features for photo quality assessment. In: Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. New York, USA: IEEE, 2006. 419-426
[9] Tang X O, Luo W, Wang X G. Content-based photo quality assessment. IEEE Transactions on Multimedia, 2013, 15(8): 1930-1943
[10] Lu X, Lin Z, Jin H L, Yang J C, Wang J Z. Rating image aesthetics using deep learning. IEEE Transactions on Multimedia, 2015, 17(11): 2021-2034
[11] Marchesotti L, Perronnin F, Larlus D, Csurka G. Assessing the aesthetic quality of photographs using generic image descriptors. In: Proceedings of the 2011 IEEE International Conference on Computer Vision. Barcelona, Spain: IEEE, 2011. 1784-1791
[12] Guo L H, Xiong Y C, Huang Q H, Li X L. Image esthetic assessment using both hand-crafting and semantic features. Neurocomputing, 2014, 143: 14-26
[13] Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks. In: Proceedings of the 26th Annual Conference on Neural Information Processing Systems 2012. Lake Tahoe, USA: Curran Associates, Inc., 2012. 1097-1105
[14] Sun Y, Wang X G, Tang X O. Deep learning face representation from predicting 10000 classes. In: Proceedings of the 27th IEEE Conference on Computer Vision and Pattern Recognition. Columbus, USA: IEEE, 2014. 1891-1898
[15] Lee H, Grosse R, Ranganath R, Ng A Y. Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations. In: Proceedings of the 26th Annual International Conference on Machine Learning. Montreal, Canada: ACM, 2009. 609-616
[16] Dong Z, Shen X, Li H Q, Tian X M. Photo quality assessment with DCNN that understands image well. In: Proceedings of the 21st International Conference on MultiMedia Modeling. Sydney, Australia: Springer International Publishing, 2015. 524-535
[17] Dong Z, Tian X M. Multi-level photo quality assessment with multi-view features. Neurocomputing, 2015, 168: 308-319
[18] Deng J, Dong W, Socher R, Li L J, Li K, Li F F. ImageNet: a large-scale hierarchical image database. In: Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition. Miami, USA: IEEE, 2009. 248-255
[19] Yin W, Mei T, Chen C W. Assessing photo quality with geo-context and crowdsourced photos. In: Proceedings of the 2012 IEEE Visual Communications and Image Processing. San Diego, USA: IEEE, 2012. 1-6
[20] Murray N, Marchesotti L, Perronnin F. AVA: a large-scale database for aesthetic visual analysis. In: Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition. Providence, USA: IEEE, 2012. 2408-2415
[21] Wang W N, Cai D, Xu X M, Liew A W C. Visual saliency detection based on region descriptors and prior knowledge. Signal Processing: Image Communication, 2014, 29(3): 424-433