Multi-label Feature Selection with Autoencoders and Hypergraph Learning

TANG Chao-Hui, ZHU Qing-Xin, HONG Chao-Qun, ZHU William

Citation: TANG Chao-Hui, ZHU Qing-Xin, HONG Chao-Qun, ZHU William. Multi-label Feature Selection with Autoencoders and Hypergraph Learning. ACTA AUTOMATICA SINICA, 2016, 42(7): 1014-1021. doi: 10.16383/j.aas.2016.c150736

doi: 10.16383/j.aas.2016.c150736
Funds:

National Natural Science Foundation of China 61379049

National Natural Science Foundation of China 61573297

National Natural Science Foundation of China 61472110

National Natural Science Foundation of China 61300192

Fundamental Research Funds for the Central Universities ZYGX2014J052

Natural Science Foundation of Fujian Province 2014J01256

More Information
    Author Bio:

    TANG Chao-Hui  Ph.D. candidate at the School of Information and Software Engineering, University of Electronic Science and Technology of China. He received his master's degree from Hunan University in 2009. His research interests cover machine learning and computer vision. E-mail: chhtang@xmut.edu.cn

    ZHU Qing-Xin  Professor at the School of Information and Software Engineering, University of Electronic Science and Technology of China. He received his Ph.D. degree from the University of Ottawa in 1993. His research interests cover bioinformatics and information retrieval. E-mail: qxzhu@uestc.edu.cn

    HONG Chao-Qun  Associate professor at the School of Computer and Information Engineering, Xiamen University of Technology. He received his Ph.D. degree from Zhejiang University in 2011. His research interests cover computer vision and machine learning. E-mail: cqhong@xmut.edu.cn

    Corresponding author:

    ZHU William  Professor at Minnan Normal University. He received his Ph.D. degree from the University of Auckland in 2006. His research interests cover data mining and artificial intelligence. Corresponding author of this paper. E-mail: williamfengzhu@gmail.com
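  • Abstract: In real-world applications, more and more data are multi-labeled and high-dimensional, and contain a large amount of redundant information. To improve the efficiency of multi-label data mining, multi-label feature extraction has become an active research topic. This paper uses a denoising autoencoder to obtain a robust representation of the feature space of multi-label data. On this basis, it applies hypergraph learning theory to fuse the influence of multiple labels on the geometric relations among samples, which improves feature extraction performance. It constructs the Laplacian matrix of the hypergraph that encodes the geometric relations among multi-label samples, and obtains the low-dimensional projection space through eigenvalue decomposition of this Laplacian matrix. Experimental results show that the proposed algorithm is effective and feasible in terms of classification performance.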
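The first stage described in the abstract, learning a noise-robust representation with a denoising autoencoder, can be sketched as follows. The page does not state which autoencoder variant or hyperparameters are used, so this sketch assumes the closed-form linear marginalized denoising autoencoder as a stand-in. The function name, the `noise` corruption probability, and the `reg` ridge term are illustrative choices, not the authors' settings.

```python
import numpy as np


def marginalized_denoising_autoencoder(X, noise=0.3, reg=1e-5):
    """Closed-form linear denoising layer (assumed stand-in, not the paper's exact model).

    X     : (d, n) feature matrix, one column per sample.
    noise : probability that a feature is zeroed out by the corruption.
    reg   : small ridge term that keeps the linear system well conditioned.
    Returns the tanh-squashed hidden representation (d, n) and the mapping W (d, d+1).
    """
    d, n = X.shape
    Xb = np.vstack([X, np.ones((1, n))])      # append a constant bias row
    S = Xb @ Xb.T                             # (d+1, d+1) scatter matrix
    q = np.full(d + 1, 1.0 - noise)           # survival probability of each feature
    q[-1] = 1.0                               # the bias row is never corrupted
    Q = S * np.outer(q, q)                    # expected scatter of the corrupted input
    np.fill_diagonal(Q, q * np.diag(S))       # diagonal entries survive with prob. q_i
    P = S[:d, :] * q[None, :]                 # expected clean-vs-corrupted cross term
    # W = P (Q + reg*I)^{-1}; the system matrix is symmetric, so solve the transpose.
    W = np.linalg.solve(Q + reg * np.eye(d + 1), P.T).T
    hidden = np.tanh(W @ Xb)
    return hidden, W


# Illustrative usage on synthetic data (4 features, 50 samples).
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 50))
hidden, W = marginalized_denoising_autoencoder(X, noise=0.3)
print(hidden.shape, W.shape)
```

With `noise=0.0` the mapping reduces to an (almost) exact reconstruction of the input, which is a convenient sanity check; larger values trade reconstruction fidelity for robustness to missing features.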
  • Fig. 1  The influence of ks on AveragePrecision (kd=3)

    Fig. 2  The influence of kd on AveragePrecision (ks=8)

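    Table 1  Definitions of important notations

    Notation        Meaning
    n               Number of training samples in the training set
    u, V            Vertex, vertex set
    f ∈ R^n         Score vector of all samples
    f(u) or f_u     Score function of sample u
    e, E            Hyperedge, hyperedge set
    δ(e)            Degree of hyperedge e
    d(u)            Degree of vertex u
    D_v             Degree matrix of the vertex set
    D_e             Degree matrix of the hyperedge set
    W(e)            Weight of hyperedge e
    H               Adjacency matrix of the hypergraph
    Ω               Ω(i, i) is the weight of the i-th hyperedge; all other entries are 0
    r               Feature dimension after reduction
    I               Original feature space before hypergraph learning
    S               Semantic feature space after hypergraph learning
    P_i             Local patch based on sample x_i
    P_i             Feature projection matrix of the local patch based on sample x_i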
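The notation above is enough to sketch how a hypergraph Laplacian and its eigen-decomposition yield a low-dimensional embedding. The sketch below assumes the normalized Laplacian I - D_v^{-1/2} H Ω D_e^{-1} H^T D_v^{-1/2} from the hypergraph learning literature, with H stored as |V|×|E| (vertices in rows). The toy hypergraph, in which each label forms one hyperedge, and the function names are illustrative assumptions; the page does not describe how the paper actually builds its hyperedges.

```python
import numpy as np


def hypergraph_laplacian(H, w):
    """Normalized hypergraph Laplacian I - Dv^{-1/2} H Omega De^{-1} H^T Dv^{-1/2}.

    H : (n_vertices, n_edges) 0/1 matrix, H[u, e] = 1 if vertex u lies on hyperedge e.
    w : (n_edges,) positive hyperedge weights (the diagonal of Omega).
    """
    H = np.asarray(H, dtype=float)
    w = np.asarray(w, dtype=float)
    d_v = H @ w                    # vertex degrees d(u): weighted count of incident edges
    delta_e = H.sum(axis=0)        # hyperedge degrees delta(e): vertices per edge
    if np.any(d_v <= 0) or np.any(delta_e <= 0):
        raise ValueError("every vertex and every hyperedge must have positive degree")
    dv_inv_sqrt = 1.0 / np.sqrt(d_v)
    theta = (dv_inv_sqrt[:, None] * H) @ np.diag(w / delta_e) @ (H.T * dv_inv_sqrt[None, :])
    return np.eye(H.shape[0]) - theta


def spectral_embedding(L, r):
    """Return the r smallest eigenvalues of L and their eigenvectors as columns."""
    eigvals, eigvecs = np.linalg.eigh(L)   # eigh returns eigenvalues in ascending order
    return eigvals[:r], eigvecs[:, :r]


# Toy example (assumption): 5 samples, 3 labels, one hyperedge per label.
Y = np.array([[1, 0, 1],
              [1, 1, 0],
              [0, 1, 0],
              [0, 1, 1],
              [1, 0, 0]])
L = hypergraph_laplacian(Y, np.ones(3))
eigvals, embedding = spectral_embedding(L, r=2)
print(embedding.shape)
```

The smallest eigenvalue of this Laplacian is zero, with an eigenvector proportional to the square roots of the vertex degrees, so in practice that trivial direction is often dropped before the remaining eigenvectors are used as coordinates.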

    Table 2  Space complexity of the algorithm

    Matrix    Space complexity
    D_v       |V|×|V|
    D_e       |E|×|E|
    H         |E|×|V|
    Ω         |E|×|E|
    L_g       |E|×|V|

    Table 3  Information of data sets

    No.    Name        Samples    Features    Labels
    1      Emotions    593        72          6
    2      Yeast       2417       103         14
    3      Scene       2407       294         6
    4      Birds       645        260         19
    5      Computer    5000       681         33

    Table 4  Results on Emotions (params=(8, 3, 15))

    Metric    a0       a1       a2       a3       MLFS-AH
    OE        0.290    0.277    0.489    0.265    0.256
    Cov       1.893    1.842    2.791    1.733    1.756
    RL        0.173    0.181    0.349    0.168    0.152
    AP        0.770    0.784    0.658    0.811    0.825

    Table 5  Results on Yeast (params=(7, 3, 30))

    Metric    a0       a1       a2       a3       MLFS-AH
    OE        0.283    0.274    0.289    0.268    0.243
    Cov       6.452    6.331    6.538    6.245    6.121
    RL        0.174    0.168    0.203    0.156    0.160
    AP        0.760    0.758    0.717    0.782    0.811

    Table 6  Results on Scene (params=(8, 3, 20))

    Metric    a0       a1       a2       a3       MLFS-AH
    OE        0.275    0.261    0.318    0.260    0.248
    Cov       0.573    0.536    0.619    0.425    0.429
    RL        0.163    0.165    0.230    0.166    0.157
    AP        0.776    0.795    0.723    0.792    0.815

    Table 7  Results on Birds (params=(8, 2, 45))

    Metric    a0       a1       a2       a3       MLFS-AH
    OE        0.379    0.371    0.369    0.352    0.344
    Cov       3.411    3.426    3.724    3.385    3.389
    RL        0.129    0.125    0.138    0.124    0.121
    AP        0.712    0.718    0.705    0.727    0.742

    Table 8  Results on Computer (params=(7, 2, 60))

    Metric    a0       a1       a2       a3       MLFS-AH
    OE        0.434    0.438    0.439    0.432    0.425
    Cov       4.435    4.439    4.532    4.378    4.339
    RL        0.108    0.104    0.106    0.089    0.091
    AP        0.641    0.643    0.647    0.649    0.675
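The result tables report four ranking-based measures. Assuming the abbreviations follow the usual multi-label conventions (OE = one-error, Cov = coverage, RL = ranking loss, AP = average precision; only AveragePrecision is spelled out on this page), the sketch below shows how such scores are commonly computed from a binary label matrix and real-valued label scores. The function names and the tie-breaking rule are illustrative assumptions, and the paper's exact definitions may differ in detail. Lower is better for OE, Cov and RL; higher is better for AP.

```python
import numpy as np


def _ranks(scores):
    """Rank labels per sample: rank 1 is the highest score (ties broken by label index)."""
    order = np.argsort(-scores, axis=1, kind="stable")
    ranks = np.empty_like(order)
    rows = np.arange(scores.shape[0])[:, None]
    ranks[rows, order] = np.arange(1, scores.shape[1] + 1)
    return ranks


def one_error(Y, scores):
    """Fraction of samples whose top-ranked label is not a relevant label."""
    top = np.argmax(scores, axis=1)
    return float(np.mean(Y[np.arange(Y.shape[0]), top] == 0))


def coverage(Y, scores):
    """Average number of steps down the ranking needed to cover all relevant labels."""
    return float(np.mean(np.max(_ranks(scores) * Y, axis=1) - 1))


def ranking_loss(Y, scores):
    """Average fraction of (relevant, irrelevant) label pairs that are ordered wrongly."""
    losses = []
    for y, s in zip(Y, scores):
        rel, irr = s[y == 1], s[y == 0]
        if rel.size and irr.size:           # skip samples without both kinds of labels
            losses.append(np.mean(rel[:, None] <= irr[None, :]))
    return float(np.mean(losses))


def average_precision(Y, scores):
    """Average, over relevant labels, of the precision at that label's rank."""
    ranks = _ranks(scores)
    values = []
    for y, r in zip(Y, ranks):
        rel_ranks = np.sort(r[y == 1])
        if rel_ranks.size:
            values.append(np.mean(np.arange(1, rel_ranks.size + 1) / rel_ranks))
    return float(np.mean(values))


# Tiny illustrative example: 2 samples, 3 labels.
Y = np.array([[1, 0, 1],
              [0, 1, 0]])
scores = np.array([[0.9, 0.5, 0.2],
                   [0.1, 0.3, 0.6]])
print(one_error(Y, scores), coverage(Y, scores),
      ranking_loss(Y, scores), average_precision(Y, scores))
```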
Publication history
  • Received:  2015-11-09
  • Accepted:  2016-05-03
  • Published:  2016-07-01
