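Fine-grained Entity Type Classification Based on Transfer Learning

FENG Jian-Zhou^{1, 2}, MA Xiang-Cong^{1, 2}

doi: 10.16383/j.aas.c190041

1. School of Information Science and Engineering, Yanshan University, Qinhuangdao 066004
2. Software Engineering Key Laboratory of Hebei Province, Yanshan University, Qinhuangdao 066004

Funds: Supported by the National Natural Science Foundation of China (61602401), the Youth Foundation for Science and Technology Research of Higher Education Institutions of Hebei Province (QN2018074), and the Natural Science Foundation of Hebei Province (F2019203157)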

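About the authors:
FENG Jian-Zhou: Associate professor at the School of Information Science and Engineering, Yanshan University. Main research interests: knowledge graphs and the semantic web. Corresponding author of this paper. E-mail: fjzwxh@ysu.edu.cn

MA Xiang-Cong: Master's student at the School of Information Science and Engineering, Yanshan University. Main research interest: knowledge graphs. E-mail: maxiangcong@126.com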

Publication history
- Received: 2019-01-16
- Accepted: 2019-08-08
- Published online: 2020-08-26
- Issue date: 2020-08-26


Abstract

Fine-grained entity type classification (FETC) aims to map entities mentioned in text to types in a hierarchy of fine-grained entity types. In recent years, deep neural networks have brought substantial progress in entity classification. However, training a neural network model that recognizes types accurately requires a sufficient amount of labeled data, and labeled corpora for fine-grained entity classification are very scarce, so classifying entities in domains without labeled corpora remains difficult. For entity classification tasks that lack labeled corpora, this paper proposes a fine-grained entity classification method based on transfer learning. First, a mapping relation model is built to mine the semantic relationships between entity types that have labeled corpora and entity types that do not, and for each unlabeled entity type a mapping set of corresponding labeled types is constructed. Then, a bidirectional long short-term memory (BiLSTM) model is built, and the combination of sentence vectors representing the mapping type set is used as the model input to train the unlabeled entity type. Finally, an attention mechanism is constructed from the semantic distance between each type in the mapping type set and the corresponding unlabeled type, yielding an entity classifier that recognizes the types of unknown entities. Experiments show that the method achieves good results and can identify the types of unknown named entities without any labeled corpus.
- Fine-grained entity type classification (FETC)
- transfer learning
- bidirectional long short-term memory (BiLSTM) model
- attention mechanism
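
As a rough illustration of two ideas in the abstract, the sketch below selects a mapping set for one unlabeled type and turns semantic closeness into attention weights. Everything specific here is an assumption made for illustration: the toy type vectors, the value of `k`, the function names, and the choice of cosine similarity followed by a softmax. This excerpt does not give the paper's actual mapping model or attention formula, and the BiLSTM encoder is left out entirely.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def build_mapping_set(target_vec, source_vecs, k):
    """Return the k labeled (source) types closest to one unlabeled (target) type.

    source_vecs maps a type name to its vector; the result is a list of
    (type_name, similarity) pairs sorted from most to least similar.
    """
    sims = [(name, cosine_similarity(target_vec, vec))
            for name, vec in source_vecs.items()]
    sims.sort(key=lambda item: item[1], reverse=True)
    return sims[:k]


def attention_weights(similarities):
    """Softmax over similarities, so closer types receive larger weights."""
    largest = max(similarities)
    exps = [math.exp(s - largest) for s in similarities]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_combination(vectors, weights):
    """Attention-weighted sum of per-type representation vectors."""
    dim = len(vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]


# Hypothetical 3-dimensional type vectors, chosen only for the demo.
source_types = {
    "director": [0.9, 0.1, 0.0],
    "author": [0.8, 0.3, 0.1],
    "library": [0.0, 0.2, 0.9],
    "film": [0.1, 0.9, 0.2],
}
target_type = [1.0, 0.2, 0.0]  # stands in for one unlabeled type

mapping_set = build_mapping_set(target_type, source_types, k=2)
weights = attention_weights([sim for _, sim in mapping_set])
mapped_vectors = [source_types[name] for name, _ in mapping_set]
combined = weighted_combination(mapped_vectors, weights)
print(mapping_set, weights, combined)
```

In a full model the vectors being combined would be sentence representations produced by the encoder for each mapped type rather than the raw type vectors used here.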

Fig. 1 The transfer model based on entity type relationship mapping and attention mechanism

Fig. 2 Entity type mapping relations

Fig. 3 Comparison of transfer scale and entity type mapping set scale

Table 1 Confusion matrix

		Predicted
		Positive	Negative
Actual	Positive	TP (true positive)	FN (false negative)
Actual	Negative	FP (false positive)	TN (true negative)
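
The Acc, Macro F1 and Micro F1 columns in the result tables can be derived from confusion-matrix counts like those in Table 1. The following is a minimal sketch assuming the standard per-type definitions; the paper's exact formulas are not given in this excerpt, and the function names and counts are made up for illustration.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all decisions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 from raw counts, returning 0.0 on empty denominators."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


def macro_f1(counts):
    """Average of per-type F1 scores; every type counts equally."""
    scores = [precision_recall_f1(tp, fp, fn)[2] for tp, fp, fn in counts.values()]
    return sum(scores) / len(scores)


def micro_f1(counts):
    """F1 computed from counts pooled over all types; frequent types dominate."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    return precision_recall_f1(tp, fp, fn)[2]


# Hypothetical (TP, FP, FN) counts per entity type, for illustration only.
example_counts = {
    "weapon": (8, 2, 2),
    "soldier": (1, 1, 3),
}
print(macro_f1(example_counts), micro_f1(example_counts))
```

Because the macro average weights every type equally while the micro average pools counts, a model that does poorly on rare types tends to show a lower Macro F1 than Micro F1.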

Table 2 Hyperparameter settings

$L_r$	$D_w$	$D_p$	$B$	$P_i$	$P_o$	$\lambda$
0.0002	180	85	256	0.7	0.9	0.0

Table 3 Dataset sizes

	Labeled dataset (source domain)	Unlabeled dataset (target domain)
Number of types	50	30
Number of mentions	896 914	229 685
Number of tokens	15 284 525	3 929 738

Table 4 Comparison of different models in the unlabeled domain

Model	Acc	Macro F1	Micro F1
TransNER	0.051	0.035	0.041
FNET	0.026	0.027	0.028
TLERMAM	0.369	0.290	0.355

Table 5 Comparison of different models in the sparsely labeled domain

Model	Acc	Macro F1	Micro F1
TransNER	0.500	0.337	0.534
FNET	0.523	0.329	0.447
TLERMAM	0.805	0.487	0.805

Table 6 Entity type sets of the military and culture domains

Domain	Entity types
Military	terrorist_organization, weapon, attack, soldier, military, terrorist_attack, power_station, terrorist, military_conflict
Culture	film, theater, artist, play, ethnicity, author, written_work, language, director, music, musician, newspaper, election, protest, broadcast_network, broadcast_program, tv_channel, religion, educational_institution, library, educational_department, educational_degree, actor, news_agency, instrument

Table 7 Dataset sizes of the military and culture domains

	Labeled dataset (culture domain)	Unlabeled dataset (military domain)
Number of types	25	9
Number of mentions	226 734	126 036
Number of tokens	3 927 700	2 104 890

Table 8 Comparison of entity recognition in the unlabeled military domain

Model	Acc	Macro F1	Micro F1
TransNER	0.040	0.023	0.012
FNET	0.013	0.014	0.029
TLERMAM	0.257	0.339	0.339

Table 9 Comparison of entity recognition in the military domain with a sparsely labeled corpus

Model	Acc	Macro F1	Micro F1
TransNER	0.338	0.204	0.285
FNET	0.460	0.424	0.537
TLERMAM	0.572	0.504	0.559
