L2-loss Large-scale Linear Nonparallel Support Vector Ordinal Regression

SHI Yong, LI Pei-Jia, WANG Hua-Dong

Citation: SHI Yong, LI Pei-Jia, WANG Hua-Dong. L2-loss Large-scale Linear Nonparallel Support Vector Ordinal Regression. ACTA AUTOMATICA SINICA, 2019, 45(3): 505-517. doi: 10.16383/j.aas.2018.c170438

doi: 10.16383/j.aas.2018.c170438

Funds: Supported by National Natural Science Foundation of China (71331005, 71110107026, 91546201)

More Information
    Author Bio:

    SHI Yong  Professor at University of Chinese Academy of Sciences. He received his Ph.D. degree in management science and computer systems from University of Kansas, Lawrence, KS, USA, in 1991. His research interests cover data mining and multiple criteria decision making. E-mail: yshi@ucas.ac.cn

    WANG Hua-Dong  Assistant research fellow at Samsung R & D Institute China – Beijing. He received his Ph.D. degree in July 2017 from the School of Mathematical Sciences, University of Chinese Academy of Sciences. He studied at the Research Center on Fictitious Economy and Data Science, Chinese Academy of Sciences from 2014 to 2017. His research interests cover support vector machines, machine learning, optimization theory and applications, and data mining. E-mail: wanghuadong14@mails.ucas.ac.cn

    Corresponding author: LI Pei-Jia  Ph.D. candidate at the School of Computer and Control Engineering, University of Chinese Academy of Sciences. She is currently studying at the Research Center on Fictitious Economy and Data Science, Chinese Academy of Sciences. She received her bachelor's degree in engineering from Henan Normal University in 2013. Her research interests cover data mining, deep learning and natural language processing. Corresponding author of this paper. E-mail: lipeijia13@mails.ucas.ac.cn
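  • Abstract: Ordinal regression is a multi-class classification problem in which the labels carry ordering information. It arises widely in information retrieval, recommender systems, sentiment analysis and related fields. With the development of the Internet and mobile communication technologies, traditional ordinal regression algorithms often perform inadequately on the large volumes of large-scale, high-dimensional, sparse data now available. The nonparallel support vector ordinal regression (NPSVOR) model is highly adaptable and outperforms other SVM-based methods. Building on it, this paper proposes an L2-loss large-scale linear nonparallel support vector ordinal regression model: the linear formulation makes it possible to handle large-scale data, and the L2 loss imposes a larger penalty on samples that deviate further from their labels. In addition, approaching the model from two different perspectives, the paper designs a trust region Newton algorithm and a coordinate descent algorithm to solve the linear model, and compares the performance of the two algorithms. To verify the effectiveness of the model, the proposed model and algorithms are analyzed on a large number of datasets. The results show that the proposed model performs best; in particular, the model solved by the coordinate descent algorithm achieves the best test performance on the datasets.
    1)  Recommended by Associate Editor HE Hai-Bo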
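The abstract's statement that the L2 loss penalizes samples deviating further from their labels more heavily can be illustrated with a generic hinge-type penalty. The sketch below is not the paper's objective function: `l1_loss` and `l2_loss` are hypothetical helper names, and `violation` is an assumed non-negative amount by which a sample violates its margin constraint.

```python
def l1_loss(violation: float) -> float:
    """Linear (L1) hinge-type penalty: grows in proportion to the violation."""
    return max(0.0, violation)


def l2_loss(violation: float) -> float:
    """Squared (L2) hinge-type penalty: grows quadratically with the violation."""
    return max(0.0, violation) ** 2


# Compare the two penalties for increasingly large violations.
for v in (0.5, 1.0, 2.0, 4.0):
    print(f"violation={v:.1f}  L1={l1_loss(v):.2f}  L2={l2_loss(v):.2f}")
```

For violations below 1 the squared penalty is smaller than the linear one; above 1 it grows faster, so badly misplaced samples dominate the objective.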
  • Fig.  1  Geometric interpretation of NPSVOR (showing the construction of the proximal hyperplane for class 2 as an example)

    Fig.  2  Comparison of TRON, TRON (WS), DCD-M and DCD-Mm on eight datasets (showing the optimization problem for rank 3). The horizontal axis is the training time $t$ in seconds and the vertical axis is $f(\boldsymbol{w}^t)-f(\boldsymbol{w}^*)$, the gap in the primal objective function of L2-NPSVOR

    Fig.  3  Test MAE of L1/L2-NPSVOR as parameter $C$ varies on eight datasets

    Fig.  4  Test MSE of L1/L2-NPSVOR as parameter $C$ varies on eight datasets
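Fig. 2 compares trust region Newton (TRON) and dual coordinate descent (DCD) solvers. For orientation, the sketch below shows the widely used dual coordinate descent update for a plain binary linear SVM with L2 (squared hinge) loss. It is not the paper's L2-NPSVOR solver (DCD-M or DCD-Mm), whose formulation does not appear in this section. The names `dcd_l2_svm` and `predict`, the toy data, and the constant last feature standing in for a bias term are all assumptions made for illustration.

```python
import random


def dcd_l2_svm(X, y, C=1.0, epochs=50, seed=0):
    """Dual coordinate descent for a binary linear SVM with squared hinge loss.

    X: list of feature lists; y: list of labels in {-1, +1}.
    Returns the primal weight vector w (no separate bias term).
    """
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    w = [0.0] * m
    alpha = [0.0] * n
    d = 1.0 / (2.0 * C)  # diagonal term added to the dual Hessian by the L2 loss
    q_diag = [sum(v * v for v in x) + d for x in X]  # x_i . x_i + 1/(2C)
    order = list(range(n))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x_i, y_i = X[i], y[i]
            # Gradient of the dual objective with respect to alpha_i.
            g = y_i * sum(wj * xj for wj, xj in zip(w, x_i)) - 1.0 + d * alpha[i]
            # Projected gradient for the constraint alpha_i >= 0.
            pg = min(g, 0.0) if alpha[i] == 0.0 else g
            if pg != 0.0:
                new_alpha = max(alpha[i] - g / q_diag[i], 0.0)
                step = (new_alpha - alpha[i]) * y_i
                # Keep w = sum_i alpha_i * y_i * x_i in sync with alpha.
                for j, xj in enumerate(x_i):
                    if xj != 0.0:
                        w[j] += step * xj
                alpha[i] = new_alpha
    return w


def predict(w, x):
    """Sign of the linear score, with ties mapped to +1."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1


# Toy, linearly separable data; the last feature is a constant 1 (assumed bias).
X = [[2.0, 1.0, 1.0], [1.0, 2.0, 1.0], [-1.0, -2.0, 1.0], [-2.0, -1.0, 1.0]]
y = [1, 1, -1, -1]
w = dcd_l2_svm(X, y)
print([predict(w, x) for x in X])
```

Each update touches one dual variable and only the nonzero features of one sample, which is why this family of methods suits sparse, high-dimensional data.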

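    Table  1  Data statistics

    | Dataset           | Samples ($n$) | Features ($m$) | Nonzero elements | Classes | Class distribution (samples per class) |
    |-------------------|---------------|----------------|------------------|---------|----------------------------------------|
    | AmazonMp3         | 10 391        | 65 487         | 1 004 435        | 5       | ≈ 2 078                                |
    | VideoSurveillance | 22 281        | 119 793        | 1 754 092        | 5       | ≈ 4 456                                |
    | Tablets           | 35 166        | 201 061        | 3 095 663        | 5       | ≈ 7 033                                |
    | Mobilephone       | 69 891        | 265 432        | 5 041 894        | 5       | ≈ 13 978                               |
    | Cameras           | 138 011       | 654 268        | 14 308 676       | 5       | ≈ 27 602                               |
    | TripAdvisor       | 65 326        | 404 778        | 8 687 561        | 5       | ≈ 13 065                               |
    | Treebank          | 11 856        | 8 569          | 98 883           | 5       | ≈ 2 371                                |
    | MovieReview       | 5 007         | 55 020         | 961 379          | 4       | ≈ 1 251                                |
    | LargeMovie        | 50 000        | 309 362        | 6 588 192        | 8       | ≈ 6 250                                |
    | Electronics       | 409 041       | 1 422 181      | 37 303 259       | 5       | ≈ 81 808                               |
    | HealthCare        | 82 251        | 283 542        | 5 201 794        | 5       | ≈ 16 450                               |
    | AppsAndroid       | 220 566       | 253 932        | 6 602 522        | 5       | ≈ 44 113                               |
    | HomeKitchen       | 120 856       | 427 558        | 8 473 465        | 5       | ≈ 24 171                               |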

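    Table  2  Test results for each dataset and method, including MAE, MSE and training time (s) under the optimal parameters

    | Dataset           | Metric | L1-SVC | L2-SVC  | SVR   | RedSVM | NPSVOR | L2-NPSVOR (TRON) | L2-NPSVOR (DCD) |
    |-------------------|--------|--------|---------|-------|--------|--------|------------------|-----------------|
    | AmazonMp3         | MAE    | 0.564  | 0.557   | 0.534 | 0.535  | 0.488  | 0.481            | 0.481           |
    |                   | MSE    | 0.996  | 0.987   | 0.732 | 0.735  | 0.699  | 0.670            | 0.683           |
    |                   | TIME   | 0.209  | 0.660   | 0.031 | 0.186  | 0.165  | 0.830            | 0.144           |
    | VideoSurveillance | MAE    | 0.404  | 0.391   | 0.426 | 0.446  | 0.376  | 0.371            | 0.372           |
    |                   | MSE    | 0.709  | 0.668   | 0.578 | 0.592  | 0.511  | 0.493            | 0.491           |
    |                   | TIME   | 0.433  | 1.708   | 0.087 | 0.551  | 0.492  | 1.996            | 0.402           |
    | Tablets           | MAE    | 0.306  | 0.299   | 0.334 | 0.346  | 0.280  | 0.278            | 0.278           |
    |                   | MSE    | 0.514  | 0.496   | 0.444 | 0.444  | 0.373  | 0.362            | 0.363           |
    |                   | TIME   | 0.821  | 3.400   | 0.198 | 1.029  | 0.948  | 2.958            | 0.674           |
    | Mobilephone       | MAE    | 0.431  | 0.419   | 0.450 | 0.444  | 0.391  | 0.388            | 0.385           |
    |                   | MSE    | 0.736  | 0.705   | 0.604 | 0.587  | 0.536  | 0.524            | 0.522           |
    |                   | TIME   | 1.811  | 7.574   | 0.353 | 1.909  | 2.330  | 6.724            | 1.605           |
    | Cameras           | MAE    | 0.246  | 0.240   | 0.273 | 0.301  | 0.227  | 0.232            | 0.226           |
    |                   | MSE    | 0.394  | 0.381   | 0.357 | 0.375  | 0.296  | 0.299            | 0.298           |
    |                   | TIME   | 9.552  | 34.480  | 1.401 | 6.016  | 6.341  | 30.132           | 5.388           |
    | TripAdvisor       | MAE    | 0.398  | 0.388   | 0.433 | 0.429  | 0.365  | 0.365            | 0.366           |
    |                   | MSE    | 0.611  | 0.583   | 0.539 | 0.523  | 0.445  | 0.449            | 0.452           |
    |                   | TIME   | 2.331  | 12.778  | 0.807 | 2.110  | 2.857  | 9.238            | 3.505           |
    | Treebank          | MAE    | 0.907  | 0.841   | 0.784 | 0.752  | 0.763  | 0.806            | 0.756           |
    |                   | MSE    | 1.652  | 1.455   | 1.116 | 1.049  | 1.126  | 1.229            | 1.068           |
    |                   | TIME   | 0.025  | 0.040   | 0.004 | 0.015  | 0.026  | 0.035            | 0.024           |
    | MovieReview       | MAE    | 0.501  | 0.490   | 0.448 | 0.447  | 0.432  | 0.436            | 0.431           |
    |                   | MSE    | 0.615  | 0.582   | 0.486 | 0.485  | 0.476  | 0.476            | 0.475           |
    |                   | TIME   | 0.121  | 0.429   | 0.029 | 0.133  | 0.130  | 0.373            | 0.125           |
    | LargeMovie        | MAE    | 1.205  | 1.176   | 1.182 | 1.093  | 0.992  | 1.008            | 1.002           |
    |                   | MSE    | 3.617  | 3.502   | 2.469 | 2.225  | 2.046  | 2.075            | 2.020           |
    |                   | TIME   | 3.311  | 10.416  | 0.328 | 1.965  | 2.569  | 7.523            | 2.493           |
    | Electronics       | MAE    | 0.592  | 0.590   | 0.606 | 0.620  | 0.529  | 0.526            | 0.520           |
    |                   | MSE    | 1.069  | 1.050   | 0.840 | 0.848  | 0.747  | 0.736            | 0.731           |
    |                   | TIME   | 22.316 | 168.141 | 4.878 | 10.736 | 23.075 | 116.586          | 18.062          |
    | HealthCare        | MAE    | 0.637  | 0.621   | 0.660 | 0.681  | 0.591  | 0.590            | 0.589           |
    |                   | MSE    | 1.338  | 1.282   | 1.004 | 1.062  | 0.945  | 0.920            | 0.929           |
    |                   | TIME   | 2.098  | 7.429   | 0.439 | 2.686  | 2.954  | 6.365            | 2.673           |
    | AppsAndroid       | MAE    | 0.640  | 0.616   | 0.656 | 0.659  | 0.584  | 0.590            | 0.584           |
    |                   | MSE    | 1.179  | 1.106   | 0.922 | 0.920  | 0.844  | 0.872            | 0.859           |
    |                   | TIME   | 4.043  | 14.924  | 0.634 | 1.603  | 4.574  | 11.423           | 6.290           |
    | HomeKitchen       | MAE    | 0.585  | 0.574   | 0.597 | 0.609  | 0.519  | 0.519            | 0.510           |
    |                   | MSE    | 1.050  | 1.015   | 0.829 | 0.842  | 0.745  | 0.723            | 0.720           |
    |                   | TIME   | 5.587  | 19.393  | 0.896 | 1.786  | 5.171  | 19.560           | 4.475           |
    | Average rank      | MAE    | 5.64   | 4.57    | 5.64  | 5.86   | 2.50   | 2.21             | 1.57            |
    |                   | MSE    | 7.00   | 6.00    | 4.36  | 4.29   | 2.36   | 2.29             | 1.64            |
    |                   | TIME   | 3.57   | 6.79    | 1.00  | 3.29   | 4.07   | 6.21             | 3.07            |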
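Table 2 reports MAE and MSE on ordinal labels. As a reminder of how these two metrics are usually computed when ranks are treated as integers, here is a minimal sketch. The helper names and the example labels are made up for illustration and are not taken from the paper's experiments.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted ranks."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average squared difference between true and predicted ranks."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical five-level ratings: one prediction is off by 1, one by 2.
y_true = [1, 2, 3, 4, 5]
y_pred = [1, 3, 3, 2, 5]
print(mean_absolute_error(y_true, y_pred))  # 0.6
print(mean_squared_error(y_true, y_pred))   # 1.0
```

Because MSE squares each error, a single prediction that is two ranks off counts four times as much as one that is one rank off. This is why two methods can tie on MAE yet differ on MSE, as the two L2-NPSVOR columns do on AmazonMp3 in Table 2.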
Publication history
  • Received:  2017-08-01
  • Accepted:  2017-10-30
  • Published:  2019-03-20
