Full-words Automatic Word Sense Tagging Based on Unsupervised Learning Algorithm

LU Zhi-Mao, LIU Ting, LI Sheng

LU Zhi-Mao, LIU Ting, LI Sheng. Full-words Automatic Word Sense Tagging Based on Unsupervised Learning Algorithm. ACTA AUTOMATICA SINICA, 2006, 32(2): 228-236.

1. School of Computer Science and Technology, Harbin Engineering University, Harbin 150001

Corresponding author: LU Zhi-Mao

Metrics
- Article views: 4315
- HTML full-text views: 53
- PDF downloads: 1820
- Citations: 0
Publication history
- Received: 2004-05-24
- Revised: 2005-11-30
- Published: 2006-03-20

Abstract: To tag word senses automatically across full Chinese text, this paper presents a new word sense tagging method based on unsupervised machine learning strategies. Four word sense disambiguation models were built and their test results compared. The best-performing model combines two unsupervised machine learning strategies and uses dependency grammar analysis to select contextual feature words. The resulting method can be trained on large-scale corpora, which goes a long way toward resolving the data sparseness problem. It also offers high tagging accuracy, high speed, and good extensibility, making it suitable for word sense tagging of large-scale real-world text.

Keywords:
- word sense tagging
- unsupervised learning algorithm
- naive Bayesian model
- dependency grammar
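
As a rough illustration of the naive Bayesian model named in the keywords, the sketch below scores candidate senses of an ambiguous word from its context words. It is not the authors' implementation: the abstract does not say how the unsupervised training data are obtained, which two strategies are combined, or how dependency analysis selects feature words. The toy `examples` list stands in for whatever automatically labeled data the unsupervised step would produce, and the function names `train_naive_bayes` and `tag_sense`, the English toy data, and the add-one smoothing are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(examples):
    """Count sense frequencies and per-sense context-word frequencies.

    `examples` is a list of (sense, context_words) pairs. In an unsupervised
    setting these would be labeled automatically; here they are hypothetical
    toy data.
    """
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for sense, context in examples:
        sense_counts[sense] += 1
        word_counts[sense].update(context)
        vocab.update(context)
    return sense_counts, word_counts, vocab


def tag_sense(context, sense_counts, word_counts, vocab):
    """Return the sense s maximizing log P(s) + sum over w of log P(w | s)."""
    total = sum(sense_counts.values())
    best_sense, best_score = None, float("-inf")
    for sense, count in sense_counts.items():
        score = math.log(count / total)
        denom = sum(word_counts[sense].values()) + len(vocab)
        for word in context:
            # Add-one smoothing (an assumption) keeps unseen words from
            # driving a sense's probability to zero.
            score += math.log((word_counts[sense][word] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


# Hypothetical toy data: two senses of the word "bank".
examples = [
    ("bank/finance", ["money", "loan", "account"]),
    ("bank/finance", ["deposit", "money"]),
    ("bank/river", ["water", "shore", "fishing"]),
]
model = train_naive_bayes(examples)
print(tag_sense(["loan", "money"], *model))
```

In a setup like the one the abstract describes, the `context` list would presumably hold only the feature words chosen by dependency analysis rather than every word in a fixed window, which is one way such a model could reduce noise in its features.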
