A Novel Logistic Regression Model Based on Density Estimation

Mao Yi; Chen Wen-Lin; Guo Bao-Long; Chen Yi-Xin

doi: 10.3724/SP.J.1004.2014.00062

1.
Institute of Intelligent Control and Image Engineering, School of Electromechanical Engineering, Xidian University, Xi'an 710071, China;
2.
Department of Computer Science and Engineering, Washington University in St. Louis, St. Louis 63130, USA

Funds:

Supported by National Natural Science Foundation of China (61105066, 61201290, 61305041, 61305040)

Details

Author biography:
Mao Yi: Ph.D. candidate at the Institute of Intelligent Control and Image Engineering, Xidian University. Received a bachelor's degree in measurement and control technology and instrumentation from Xidian University in 2008. Research interests include data mining and machine learning. Corresponding author of this paper. E-mail: olivia.maoy@gmail.com

Publication history
- Received: 2013-01-16
- Revised: 2013-04-02
- Published: 2014-01-20

Abstract

Abstract: We propose a density-based logistic regression (DLR) model to address nonlinear classification problems in logistic regression. Using a Nadaraya-Watson density estimator, the training data are mapped into a particular feature space. An optimization model is then set up to optimize the feature weights and the width used in the Nadaraya-Watson density estimation algorithm. We show that DLR is superior not only to standard logistic regression but also to kernel logistic regression (KLR) with radial basis function (RBF) kernels. Compared with nonlinear methods such as KLR and the support vector machine (SVM), the approach achieves both better classification accuracy and better time efficiency. Another major advantage of the method is that it extends naturally to hybrid data containing both categorical and numerical features. Moreover, the approach shares with logistic regression (LR) the advantage of model interpretability, which kernel-based methods such as KLR and SVM lack.
- Nonlinear classification
- Nadaraya-Watson density estimation
- logistic regression (LR)
- kernel function
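
The pipeline summarized in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact feature mapping, so the per-feature log-odds transform, the Gaussian kernel, the plain gradient-descent fit, and every function name below are assumptions. The bandwidth `h` is also held fixed here, whereas the abstract says the width is optimized together with the feature weights.

```python
import numpy as np


def nw_class_prob(x_query, x_train, y_train, h):
    """Nadaraya-Watson estimate of P(y=1 | x) for one numerical feature.

    Assumes a Gaussian kernel with bandwidth h and labels in {0, 1}.
    """
    diff = (x_query[:, None] - x_train[None, :]) / h
    k = np.exp(-0.5 * diff ** 2)                 # (m, n) kernel weights
    return (k @ y_train) / (k.sum(axis=1) + 1e-12)


def density_features(X_query, X_train, y_train, h, eps=1e-6):
    """Map each feature to an estimated log-odds value (assumed mapping)."""
    cols = []
    for d in range(X_train.shape[1]):
        p = nw_class_prob(X_query[:, d], X_train[:, d], y_train, h[d])
        p = np.clip(p, eps, 1.0 - eps)           # keep the logarithm finite
        cols.append(np.log(p / (1.0 - p)))
    return np.column_stack(cols)


def fit_logistic(Phi, y, lr=0.1, steps=2000):
    """Fit weights and bias by gradient descent on the mean logistic loss."""
    w = np.zeros(Phi.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(Phi @ w + b)))
        g = p - y                                # gradient w.r.t. the logits
        w -= lr * (Phi.T @ g) / len(y)
        b -= lr * g.mean()
    return w, b


# Toy data that no linear boundary on x can separate: class 1 iff |x| < 0.98.
x = np.linspace(-3.0, 3.0, 121)
X = x[:, None]
y = (np.abs(x) < 0.98).astype(float)
h = np.array([0.3])                              # fixed bandwidth (assumption)

Phi = density_features(X, X, y, h)               # features of the training set
w, b = fit_logistic(Phi, y)
pred = (Phi @ w + b > 0).astype(int)
acc = np.mean(pred == y)
print(f"training accuracy: {acc:.3f}")
```

Because the final model is still linear in the mapped features, each learned weight can be read as the influence of one original feature, which is the interpretability property the abstract attributes to the method. The features here are computed on the training points themselves, so the printed accuracy is optimistic; a held-out set would give a fairer estimate.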
