Sparse Bayesian Mixture of Experts and Its Application to Spectral Multivariate Calibration

YU Bin-Feng, JI Hai-Bo

Citation: YU Bin-Feng, JI Hai-Bo. Sparse Bayesian Mixture of Experts and Its Application to Spectral Multivariate Calibration. ACTA AUTOMATICA SINICA, 2016, 42(4): 566-579. doi: 10.16383/j.aas.2016.c150255

doi: 10.16383/j.aas.2016.c150255
Funds:

National High Technology Research and Development Program of China (863 Program) AA2100100021

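More Information
    Author Bio:

    JI Hai-Bo  Professor in the Department of Automation, University of Science and Technology of China. He received his bachelor's degree from the Department of Mechanics and Mechanical Engineering, Zhejiang University, in 1984, and his Ph.D. degree from the Department of Mechanics and Engineering Science, Peking University, in 1990. His research interests cover nonlinear control and adaptive control. E-mail: jihb@ustc.edu.cn

    Corresponding author:

    YU Bin-Feng  Ph.D. candidate in the Department of Automation, University of Science and Technology of China. He received his bachelor's degree from the Department of Automation, University of Science and Technology of China, in 2010. His research interests cover machine learning and spectral analysis. Corresponding author of this paper. E-mail: ybfeng@mail.ustc.edu.cn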

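  • Abstract: In multivariate calibration of spectral data, the spectra are usually collected under several different environmental conditions. To model high-dimensional spectral data that originate from different environments, this paper proposes a new sparse Bayesian mixture of experts model and uses it to select sparse features for the multivariate calibration model. A mixture of experts model partitions the training data into sub-classes and then applies a separate prediction model to each partition, which makes it well suited to spectral data from multiple environments. The proposed sparse mixture of experts model performs feature selection with sparse Bayesian methods and does not depend on parameters specified in advance. It also uses a probit model as the gate function to obtain an analytic posterior distribution, which avoids the approximation otherwise needed when selecting features in the gate function's classification model. The proposed model is compared with several commonly used regression models on an artificial data set and several public spectral data sets. The results show that, when predicting concentrations from spectral data of multiple sources, the proposed model is somewhat more accurate than traditional regression methods.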
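
    To make the structure described in the abstract concrete, the following is a minimal sketch of a mixture of experts prediction with a probit gate. It is not the authors' implementation: it assumes only two experts, linear expert models, and fixed, hand-picked weights (the names `me_predict`, `gate_v`, and `expert_w` are hypothetical). The gate assigns each input a soft membership in each sub-class, and the prediction is the gate-weighted combination of the expert outputs.

```python
import math
import numpy as np

def probit(a):
    """Standard normal CDF, used here as the probit gate function."""
    return 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))

def me_predict(x, gate_v, expert_w):
    """Predictive mean of a two-expert mixture with a probit gate.

    x        : (d,) input vector (e.g. a spectrum; hypothetical data)
    gate_v   : (d,) gate weights; probit(gate_v @ x) is the weight of expert 0
    expert_w : (2, d) linear regression coefficients, one row per expert
    """
    g0 = probit(float(gate_v @ x))
    gates = np.array([g0, 1.0 - g0])   # soft assignment to the two experts
    expert_means = expert_w @ x        # each expert's own prediction
    return float(gates @ expert_means), gates

# Toy example with made-up numbers (assumption, not data from the paper).
x = np.array([1.0, 0.0, 2.0])
gate_v = np.zeros(3)                   # zero gate weights: both experts equally likely
expert_w = np.array([[1.0, 0.0, 0.0],
                     [0.0, 0.0, 1.0]])
y_hat, gates = me_predict(x, gate_v, expert_w)
```

    With a zero gate vector both experts receive weight 0.5, so the prediction is the average of the two expert outputs. In a sparse variant, many entries of `gate_v` and `expert_w` would be driven to zero by the prior.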
  • Fig. 1  The probabilistic graph of the SME model

    Fig. 2  Plot of the lower bound L(q) versus the number of experts

    Fig. 3  The means of the coefficients of expert models (posterior means of the precision matrix A of the expert models in each dimension)

    Fig. 4  The means of the coefficients of gate function (posterior means of the precision matrix C of the gate function in each dimension)

    Fig. 5  The means of the regression coefficients of the three expert models of SME trained with all samples of the corn data set
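
    Figs. 3 and 4 concern per-dimension precisions. In sparse Bayesian learning, each coefficient gets its own prior precision, and a large precision drives the corresponding coefficient toward zero. The sketch below illustrates this idea for a single linear regression, using the evidence-maximization updates known from the relevance vector machine literature. It is an assumption-laden illustration, not the inference procedure of the paper; the function names, the cap `alpha_cap`, and the toy data are hypothetical.

```python
import numpy as np

def posterior(X, y, alpha, beta):
    """Gaussian posterior (mean, covariance) of the weights given precisions."""
    S = np.linalg.inv(np.diag(alpha) + beta * X.T @ X)
    return beta * S @ X.T @ y, S

def ard_regression(X, y, n_iter=50, alpha_cap=1e6):
    """Sparse Bayesian linear regression with one prior precision per feature."""
    n, d = X.shape
    alpha = np.ones(d)   # prior precision of each coefficient
    beta = 1.0           # noise precision
    for _ in range(n_iter):
        m, S = posterior(X, y, alpha, beta)
        # gamma_i in [0, 1]: how strongly the data determine coefficient i
        gamma = np.clip(1.0 - alpha * np.diag(S), 0.0, 1.0)
        # re-estimate the precisions; capped to keep the matrix inverse stable
        alpha = np.minimum(gamma / (m ** 2 + 1e-12), alpha_cap)
        resid = y - X @ m
        beta = (n - gamma.sum()) / (resid @ resid + 1e-12)
    m, _ = posterior(X, y, alpha, beta)
    return m, alpha

# Toy data (assumption): only features 0 and 3 influence the response.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
w_true = np.array([2.0, 0.0, 0.0, -3.0, 0.0])
y = X @ w_true + 0.1 * rng.normal(size=100)
m, alpha = ard_regression(X, y)
```

    On this toy problem the precisions of the three irrelevant features grow far larger than those of the two relevant ones, which is the pattern a plot of per-dimension precisions would reveal.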

    Table 1  The prediction results on the artificial data set

    Method    RMSECV
    PLS       5.1617 ± 0.7679
    SVR       4.9164 ± 0.5646
    LASSO     5.2411 ± 0.4112
    Ridge     5.0103 ± 0.5044
    ME        10.236 ± 1.5720
    SME       1.5130 ± 0.3117
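
    The tables report RMSECV as a mean ± a spread. This page does not state how the splits were made, so the following is only a sketch under assumptions: k-fold cross-validation with random splits, repeated with different seeds, and ridge regression as a stand-in model. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression coefficients."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def rmsecv(X, y, k=5, lam=1e-3, seed=0):
    """Root mean squared error of held-out predictions pooled over k folds."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    sq_err = np.empty(len(y))
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)          # all samples not in this fold
        w = ridge_fit(X[train], y[train], lam)
        sq_err[fold] = (X[fold] @ w - y[fold]) ** 2
    return float(np.sqrt(sq_err.mean()))

# Synthetic data (assumption, not from the paper).
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 4))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + 0.1 * rng.normal(size=60)

# Repeat the cross-validation with different random splits and summarize.
scores = np.array([rmsecv(X, y, seed=s) for s in range(10)])
summary = f"{scores.mean():.4f} ± {scores.std():.4f}"
```

    Pooling the squared errors over all folds before taking the square root gives one RMSECV per repetition; the mean and standard deviation over repetitions then yield a summary in the same "value ± spread" form as the tables.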

    Table 2  The prediction results on the corn data set

    Method           RMSECV
    PLS              0.1480 ± 0.0093
    SVR              0.1504 ± 0.0084
    LASSO            0.1510 ± 0.0114
    Ridge            0.1511 ± 0.0083
    Bagging-ridge    0.1239 ± 0.0113
    SME              0.1124 ± 0.0034
    Multi-task       0.1145 ± 0.0094

    Table 3  The prediction results on the temperature data set

    Method           RMSECV
    PLS              0.0148 ± 0.0026
    SVR              0.0180 ± 0.0019
    LASSO            0.0208 ± 0.0031
    Ridge            0.0345 ± 0.0013
    Bagging-ridge    0.0143 ± 0.0018
    SME              0.0106 ± 0.0008
    Multi-task       0.0225 ± 0.0032

    Table 4  The prediction results on the pharmaceutical data set

    Method           RMSECV
    PLS              0.0148 ± 0.0026
    SVR              0.0180 ± 0.0019
    LASSO            0.0208 ± 0.0031
    Ridge            0.0345 ± 0.0013
    Bagging-ridge    0.0143 ± 0.0018
    SME              0.0106 ± 0.0008
    Multi-task       0.0225 ± 0.0032
Publication history
  • Received:  2015-04-29
  • Accepted:  2015-08-31
  • Published:  2016-04-01
