Kernel Support Vector Machine for Domain Adaptation
Abstract: Domain adaptation learning is an effective approach to pattern classification problems in which the prior information needed to train a learning model is unavailable or insufficient. Minimizing the distribution discrepancy between the source domain and the target domain is one of the key factors in its success; however, methods that consider only the mean discrepancy between the two domain distributions can perform poorly on concrete domain adaptation problems. To address this, we propose a kernel support vector machine for domain adaptation (DAKSVM), together with its least-squares variant (LSDAKSVM) derived from the least-squares SVM (LS-SVM), built on the structural risk minimization model. Both methods minimize the mean discrepancy and the scatter discrepancy between the source and target distributions in a reproducing kernel Hilbert space, which is then used to improve classification performance. Experimental results on artificial and real-world data sets show that the proposed approach achieves superior or comparable classification performance relative to related methods.
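The mean-discrepancy term the abstract refers to is commonly estimated as the empirical maximum mean discrepancy (MMD) between source and target samples in an RKHS. The sketch below is illustrative only: the RBF kernel, the `gamma` parameter, and the biased V-statistic estimator are assumptions for demonstration, not the paper's exact formulation (which additionally minimizes a scatter discrepancy).

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """RBF (Gaussian) kernel matrix between the rows of A and B."""
    sq_dists = (np.sum(A**2, axis=1)[:, None]
                + np.sum(B**2, axis=1)[None, :]
                - 2.0 * A @ B.T)
    return np.exp(-gamma * sq_dists)

def mmd2(Xs, Xt, gamma=1.0):
    """Biased empirical estimate of the squared MMD between the source
    sample Xs and the target sample Xt in the RKHS induced by the RBF
    kernel: mean k(s,s') + mean k(t,t') - 2 mean k(s,t)."""
    Kss = rbf_kernel(Xs, Xs, gamma)
    Ktt = rbf_kernel(Xt, Xt, gamma)
    Kst = rbf_kernel(Xs, Xt, gamma)
    return Kss.mean() + Ktt.mean() - 2.0 * Kst.mean()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    Xs = rng.normal(0.0, 1.0, size=(50, 2))   # source sample
    Xt = rng.normal(2.0, 1.0, size=(50, 2))   # shifted target sample
    print(mmd2(Xs, Xs), mmd2(Xs, Xt))         # identical samples give 0
```

In a DAKSVM-style formulation such a discrepancy estimate would enter the structural-risk objective as a regularization term, so that the learned decision function also aligns the two domain distributions.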