Conditional Value-based Co-training
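Abstract (translated from the Chinese): Co-training is a mainstream semi-supervised learning algorithm in which the classifiers under two views iteratively select new examples from the unlabeled set for each other, each updating the other's training set. Co-training selects new examples according to the classifier's posterior probability output, a strategy that ignores an example's value to the current classifier. To address this problem, this paper proposes an improved co-training style algorithm, CVCOT (conditional value-based co-training), which optimizes co-training with a selection strategy based on the conditional value of examples. With the conditional value of unlabeled examples defined, the classifier under each view selects new examples according to that value and uses them to update the training set. This strategy both ensures the reliability of the labels assigned to new examples and gives priority to adding informative, high-value examples to the training set, which effectively improves the classifiers. Experimental results on UCI data sets and a web page classification application show that CVCOT achieves good classification performance and learning efficiency.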
Key words (translated from the Chinese):
- machine learning
- semi-supervised learning
- co-training
- informative example
- conditional value
Abstract: Co-training is one of the major semi-supervised learning methods. It iteratively trains two classifiers under two different views and uses the predictions of each classifier on the unlabeled examples to augment the training set of the other. In each round of co-training, newly added examples are selected according to the classifier's posterior probability output, which neglects each example's value with respect to the current classifier. This paper proposes an improved co-training style algorithm, termed CVCOT (conditional value-based co-training), which employs a conditional value-based strategy for selecting candidate training examples. Specifically, the conditional value of unlabeled examples in the co-training process is defined and computed, and the classifier under each view then uses it to augment the training set of the other. The new strategy not only guarantees the reliability of the pseudo-labels but also tends to add more informative, higher-value examples to the training sets, so the classifier under either view is refined. Experiments on UCI data sets and a web page classification task indicate that CVCOT achieves better classification performance and learning efficiency.
Key words:
- machine learning
- semi-supervised learning
- co-training
- informative example
- conditional value
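The co-training loop summarized in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the nearest-centroid base learner, the toy data, and every name (`CentroidClassifier`, `posterior_score`, `fit_views`, `co_train`) are assumptions made for the example. The default criterion is the standard posterior-based selection that the abstract critiques. The definition of conditional value is not given in the abstract, so it is not reproduced here; the `score` parameter only marks where such a criterion would plug in.

```python
import math


class CentroidClassifier:
    """Toy base learner (an assumption): nearest class centroid, softmax posteriors."""

    def fit(self, X, y):
        self.centroids = {}
        for c in sorted(set(y)):
            members = [x for x, t in zip(X, y) if t == c]
            dims = range(len(members[0]))
            self.centroids[c] = [sum(m[d] for m in members) / len(members) for d in dims]
        return self

    def predict_proba(self, x):
        # Softmax over negative squared distances to each class centroid.
        scores = {c: -sum((a - b) ** 2 for a, b in zip(x, m))
                  for c, m in self.centroids.items()}
        top = max(scores.values())
        exps = {c: math.exp(s - top) for c, s in scores.items()}
        total = sum(exps.values())
        return {c: e / total for c, e in exps.items()}

    def predict(self, x):
        proba = self.predict_proba(x)
        return max(proba, key=proba.get)


def posterior_score(src_clf, dst_clf, x_src, x_dst):
    """Baseline criterion: the labeling classifier's highest posterior.

    dst_clf and x_dst are unused here; they are passed so that a criterion
    depending on the receiving classifier could be swapped in (hypothetical hook).
    """
    return max(src_clf.predict_proba(x_src).values())


def fit_views(train):
    """Fit one classifier per view on that view's current training set."""
    return [CentroidClassifier().fit([x for x, _ in t], [c for _, c in t]) for t in train]


def co_train(X1, X2, y, U1, U2, rounds=2, per_round=1, score=posterior_score):
    train = [list(zip(X1, y)), list(zip(X2, y))]  # one training set per view
    pool_views = [U1, U2]
    pool = set(range(len(U1)))                    # indices of unlabeled examples
    for _ in range(rounds):
        if not pool:
            break
        clfs = fit_views(train)
        for src in (0, 1):
            dst = 1 - src
            # Rank the remaining unlabeled examples by the selection criterion.
            ranked = sorted(
                sorted(pool),
                key=lambda i: score(clfs[src], clfs[dst],
                                    pool_views[src][i], pool_views[dst][i]),
                reverse=True,
            )
            for i in ranked[:per_round]:
                # The source view pseudo-labels the example for the other view.
                label = clfs[src].predict(pool_views[src][i])
                train[dst].append((pool_views[dst][i], label))
                pool.discard(i)
    return fit_views(train), train


# Toy two-view data (made up): class 0 lies near 0, class 1 near 3 in both views.
X1, X2, y = [[0.0], [3.0]], [[0.2], [2.8]], [0, 1]
U1 = [[0.1], [2.9], [0.5], [2.5], [1.4]]
U2 = [[0.0], [3.1], [0.4], [2.6], [1.7]]
clfs, train = co_train(X1, X2, y, U1, U2)
print([len(t) for t in train])
```

With the posterior criterion, the ambiguous example near the class boundary (1.4 in view 1, 1.7 in view 2) is never selected, because neither classifier is confident about it. That is the behavior the abstract points to: confident examples are reliable but not necessarily informative for the receiving classifier.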