Rule Extraction Approach to Text Categorization Based on Multi-population Collaborative Optimization
Abstract: To address rule extraction in text categorization, a rule extraction approach based on multi-population collaborative optimization is proposed. The approach uses information entropy to generate the initial populations and applies a multi-population collaborative optimization method to evolve the current populations. Mutual competition among populations and a mechanism for sharing excellent individuals improve the efficiency of the optimization. Experimental results show that the approach extracts a small number of rules with high accuracy and short average length. It also requires little computation time and extracts rules quickly, which makes it suitable for large-scale data sets.
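The abstract does not specify how information entropy is used to generate the initial populations. One plausible reading is that terms are scored by how much they reduce class entropy, and high-scoring terms seed the candidate rules. The following is a minimal sketch under that assumption, not the authors' method; the function names `entropy`, `information_gain`, and `rank_terms`, and the representation of documents as sets of terms, are hypothetical choices made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(docs, labels, term):
    """Reduction in class entropy from splitting documents on whether they contain `term`.

    `docs` is a list of sets of terms (an assumed representation),
    and `labels` holds the class label of each document.
    """
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    remainder = (len(with_term) / n) * entropy(with_term) \
        + (len(without_term) / n) * entropy(without_term)
    return entropy(labels) - remainder


def rank_terms(docs, labels):
    """Return the vocabulary ordered from most to least informative term."""
    vocab = sorted(set().union(*docs))
    return sorted(vocab, key=lambda t: information_gain(docs, labels, t), reverse=True)


# Toy corpus: each document is a set of terms with one class label.
docs = [{"goal", "match"}, {"goal", "team"}, {"stock", "market"}, {"stock", "team"}]
labels = ["sport", "sport", "finance", "finance"]
ranked = rank_terms(docs, labels)
```

In this toy corpus, "goal" and "stock" separate the two classes perfectly, so they rank first, while "team" appears equally in both classes and ranks last. An initial population could then be seeded with rules built from the top-ranked terms; how the paper actually does this is not stated in the abstract.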