基于大语言模型的中文实体链接实证研究

徐正斐; 辛欣

doi:10.16383/j.aas.c240069

基于大语言模型的中文实体链接实证研究

doi: 10.16383/j.aas.c240069 cstr: 32138.14.j.aas.c240069

徐正斐^{1, 2,},
辛欣^{1, 2,}

1.
北京理工大学计算机学院北京 100081
2.
北京理工大学北京市海量语言信息处理与云计算应用工程技术研究中心北京 100081

基金项目: 国家自然科学基金(62172044)资助

详细信息

作者简介:
徐正斐：北京理工大学计算机学院硕士研究生. 主要研究方向为知识工程. E-mail: zhengfei@bit.edu.cn

辛欣：北京理工大学计算机学院副教授. 主要研究方向为自然语言处理, 知识工程, 信息检索. 本文通信作者. E-mail: xxin@bit.edu.cn

计量
- 文章访问数: 941
- HTML全文浏览量: 730
- PDF下载量: 227
- 被引次数: 0
出版历程
- 收稿日期: 2024-01-31
- 录用日期: 2024-08-07
- 网络出版日期: 2025-01-14
- 刊出日期: 2025-02-17

An Empirical Study of Chinese Entity Linking Based on Large Language Model

XU Zheng-Fei^{1, 2
,},
XIN Xin^{1, 2
,}

1.
School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081
2.
Beijing Engineering Research Center of High Volume Language Information Processing and Cloud Computing Applications, Beijing Institute of Technology, Beijing 100081

Funds: Supported by National Natural Science Foundation of China (62172044)

More Information

Author Bio:
XU Zheng-Fei　Master student at the School of Computer Science and Technology, Beijing Institute of Technology. His main research interest is knowledge engineering

XIN Xin　Associate professor at the School of Computer Science and Technology, Beijing Institute of Technology. His research interest covers natural language processing, knowledge engineering, and information retrieval. Corresponding author of this paper

摘要

摘要: 近年来, 大语言模型(Large language model, LLM)在自然语言处理中取得重大进展. 在模型足够大时, 大语言模型涌现出传统的预训练语言模型(Pre-trained language model, PLM)不具备的推理能力. 为了探究如何将大语言模型的涌现能力应用于中文实体链接任务, 适配了以下四种方法: 知识增强、适配器微调、提示学习和语境学习(In-context learning, ICL). 在Hansel和CLEEK数据集上的实证研究表明, 基于Qwen-7B/ChatGLM3-6B的监督学习方法超过基于小模型的方法, 在Hansel-FS数据集上提升3.9% ~ 11.8%, 在Hansel-ZS数据集上提升0.7% ~ 4.1%, 在CLEEK数据集上提升0.6% ~ 3.7%. 而当模型参数量达到720亿时, Qwen-72B的无监督方法实现与监督微调Qwen-7B相近的结果(−2.4% ~ +1.4%). 此外, 大语言模型Qwen在长尾实体场景下有明显的优势(11.8%), 且随着参数量的增加, 优势会更加明显(13.2%). 对错误案例进行分析(以下简称错误分析)发现, 实体粒度和实体类别相关错误占比较高, 分别为36%和25%. 这表明在实体链接任务中, 准确划分实体边界以及正确判断实体类别是提高系统性能的关键.
- 实体链接 /
- 大语言模型 /
- 知识增强 /
- 适配器微调 /
- 提示学习 /
- 语境学习
Abstract: Large language models (LLMs) have recently made significant advancements in natural language processing. When scaled sufficiently, large language models exhibit reasoning capabilities that traditional pre-trained language models (PLMs) lack. In order to explore how to apply the emergent capabilities of large language models to the Chinese entity linking task, the following four methods are adapted: Knowledge augmentation, adapter fine-tuning, prompt learning, and in-context learning. Empirical studies on the Hansel and CLEEK datasets show that supervised learning methods based on Qwen-7B/ChatGLM3-6B outperform PLM-based methods. It achieves improvements ranging from 3.9% to 11.8% on the Hansel-FS dataset, 0.7% to 4.1% on the Hansel-ZS dataset, and 0.6% to 3.7% on the CLEEK dataset. When scaled to 72 billion parameters, Qwen-72B＇s unsupervised methods yield results comparable to the supervised fine-tuning of Qwen-7B, with a performance range of −2.4% to +1.4%. Furthermore, the large language model Qwen has a clear advantage in the long-tail entity scenario (11.8%), and as the number of parameters increases, the advantage will become more obvious (13.2%). The analysis of the error cases (hereinafter referred to as error analysis) found that the errors related to entity granularity and entity type accounted for a high proportion, 36% and 25% respectively. This shows that in the entity linking task, accurately dividing entity boundaries and correctly judging entity types are the key to improving system performance.
- Entity linking /
- large language model (LLM) /
- knowledge augmentation /
- adapter fine-tuning /
- prompt learning /
- in-context learning (ICL)
注释:

1) 1¹ 交互示例的输出内容来自: https://www.chatglm.cn

2) 2² https://dumps.wikimedia.org/wikidatawiki/

3) 3³ https://github.com/THUDM/ChatGLM3⁴ https://github.com/QwenLM/Qwen⁵ https://github.com/piskvorky/gensim

4) 4⁴ https://github.com/QwenLM/Qwen

5) 5⁵ https://github.com/piskvorky/gensim

6) 6⁶ https://github.com/facebookresearch/GENRE⁷ https://github.com/huggingface/peft⁸ https://huggingface.co/DMetaSoul/sbert-chinese-general-v2

7) 7⁷ https://github.com/huggingface/peft

8) 8⁸ https://huggingface.co/DMetaSoul/sbert-chinese-general-v2

HTML全文

图 1 研究动机

Fig. 1 The research motivation

下载: 全尺寸图片幻灯片

图 2 实体链接系统的整体架构

Fig. 2 The architecture of the entity linking system

下载: 全尺寸图片幻灯片

图 3 实体消歧方法中的提示语示例

Fig. 3 Prompt examples in entity disambiguation

下载: 全尺寸图片幻灯片

图 4 基于大语言模型的四种实体消歧方法

Fig. 4 Four methods for LLM-based entity disambiguation

下载: 全尺寸图片幻灯片

图 5 示例数量和选择策略对语境学习的影响

Fig. 5 The impact of examples number and selection strategies on in-context learning method

下载: 全尺寸图片幻灯片

表 1 Hansel和CLEEK数据集的统计信息

Table 1 Statistics of the Hansel and CLEEK datasets

数据集	# 指称	# 文档	# 实体
数据集	# 指称	# 文档	$ E_{\rm{known}} $	$ E_{\rm{new}} $	总计
Hansel-Train	9.89 M	1.05 M	541 K	—	541 K
Hansel-Dev	9 677	1 000	6 323	—	6 323
Hansel-FS	3 404	3 389	2 720	—	2 720
Hansel-ZS	4 208	4 200	1 054	2 992	4 046
CLEEK	2 412	100	1 100	—	1 100

下载: 导出CSV

表 2 候选实体生成模型在Hansel和CLEEK数据集上的召回率 (%)

Table 2 Recall of candidate entity generation model on Hansel and CLEEK (%)

方法	Hansel-FS			Hansel-ZS			CLEEK
方法	R@1	R@10	R@100	R@1	R@10	R@100	R@1	R@10	R@100
AT	0	61.1	63.0	70.6	78.5	78.8	69.4	77.8	79.1
BM25	13.1	41.9	71.1	69.7	84.1	90.9	34.9	46.8	57.2
DE	46.8	81.1	92.6	78.2	93.2	97.2	58.7	81.3	92.2
注: 加粗字体表示各列最优结果.

下载: 导出CSV

表 3 实体链接基线方法在Hansel和CLEEK数据集上的准确率 (%)

Table 3 Accuracy of entity linking baseline methods on Hansel and CLEEK (%)

数据集	$ {\rm{TyDE}}^{\diamond} $	$ {\rm{Oops!}}^{\diamond\ddagger} $	$ {\rm{ITNLP}}^{\diamond} $	$ {\text{YNU-HPCC}}^{\ddagger} $	$ {\rm{CA}}^{\diamond} $	DE	CA-tuned	$ {\rm{mGENRE}}^{\dagger} $
Hansel-FS	11.7	44.6	30.7	21.1	46.2	46.8	49.9	36.6
Hansel-ZS	71.6	81.6	81.7	73.6	76.6	78.2	83.5	68.4
CLEEK	—	—	—	—	—	58.7	70.5	73.7
注: 加粗字体表示各行最优结果.

下载: 导出CSV

表 4 实体链接大语言模型方法在Hansel和CLEEK数据集上的准确率(%)

Table 4 Accuracy of entity linking LLM methods on Hansel and CLEEK (%)

数据集	Qwen-7B				ChatGLM3-6B
数据集	CoT-KA	LoRA	P-tuning	ICL	CoT-KA	LoRA	P-tuning	ICL
Hansel-FS	53.8	61.7	56.7	49.2	52.5	51.6	47.2	50.6
Hansel-ZS	83.3	87.6	85.9	79.5	84.2	85.4	83.6	82.5
CLEEK	74.3	77.4	74.9	66.6	71.4	72.9	67.2	67.5
注: 加粗字体表示各行最优结果.

下载: 导出CSV

表 5 监督微调方法对准确率的影响 (%)

Table 5 Impact of supervised fine-tuning on accuracy (%)

微调方法	训练参数量	Top-1 准确率
微调方法	训练参数量	Hansel-FS	Hansel-ZS	CLEEK
P-tuning	10 M	56.7	85.9	74.9
AdaLoRA	27 M	60.4	87.3	77.5
LoRA	286 M	61.7	87.6	77.4
FT	7 B	53.9	85.7	73.7
注: 加粗字体表示在不同数据集上的最优结果.

下载: 导出CSV

表 6 大语言模型的无监督推理能力 (%)

Table 6 Unsupervised reasoning capabilities of LLMs (%)

方法	大语言模型	Top-1 准确率
方法	大语言模型	Hansel-FS	Hansel-ZS	CLEEK
ICL	Qwen-7B	49.2	79.5	66.6
	Qwen-14B	51.8	81.5	64.1
	Qwen-72B	62.2	86.8	74.8
	ChatGPT	52.7	79.4	66.4
CoT	Qwen-7B	49.8	78.9	67.6
	Qwen-14B	58.4	83.0	71.6
	Qwen-72B	63.1	85.6	75.0
	ChatGPT	55.4	78.7	67.9
CoT-KA	Qwen-7B	53.8	83.3	74.3
	Qwen-14B	60.1	86.3	75.6
	Qwen-72B	61.8	87.2	77.4
	ChatGPT	58.3	85.0	75.2
注: 加粗字体表示各组方法在不同数据集上的最优结果.

下载: 导出CSV

表 7 适配器的秩对准确率的影响 (%)

Table 7 Impact of adapter rank on accuracy (%)

秩	Top-1 准确率
秩	Hansel-FS	Hansel-ZS	CLEEK
$ r=1 $	53.8	85.4	72.4
$ r=2 $	52.9	85.1	71.3
$ r=4 $	54.5	86.0	73.0
$ r=8 $	58.8	87.4	76.5
$ r=64 $	61.4	87.3	77.4
$ r=128 $	61.7	87.6	77.4
注: 加粗字体表示在不同数据集上的最优结果.

下载: 导出CSV

表 8 虚拟提示长度对准确率的影响 (%)

Table 8 Impact of virtual prompt length on accuracy (%)

提示长度	Top-1 准确率
提示长度	Hansel-FS	Hansel-ZS	CLEEK
10	54.6	85.7	74.1
20	56.7	85.9	74.9
40	55.4	85.6	74.6
60	51.4	85.1	71.3
80	53.0	85.2	72.4
注: 加粗字体表示在不同数据集上的最优结果.

下载: 导出CSV

表 9 不同示例选择策略下的语境学习的准确率 (%)

Table 9 Accuracy of ICL under different example selection strategies (%)

模型	选择策略	Top-1 准确率
模型	选择策略	Hansel-FS	Hansel-ZS	CLEEK
Qwen-7B	Random	46.9	79.3	63.8
	BM25	46.4	79.2	63.3
	SBERT	49.2	79.5	66.6
ChatGLM3-6B	Random	48.6	82.3	66.4
	BM25	48.9	82.8	66.0
	SBERT	50.6	82.5	67.5
注: 加粗字体表示各列最优结果; 下划线字体表示各列次优结果.

下载: 导出CSV

表 10 知识增强方法的消融实验 (%)

Table 10 The ablation study on knowledge augmentation (%)

方法	Top-1 准确率
方法	Hansel-FS	Hansel-ZS	CLEEK
CA	46.7	82.7	70.1
CA+平衡负采样	49.9 + 3.2	83.5 + 0.8	70.5 + 0.4
CA+知识增强(Qwen-7B)	51.9 + 5.2	82.5 − 0.2	74.1 + 4.0
CA+知识增强(ChatGLM3-6B)	48.4 + 1.7	82.5 − 0.2	70.9 + 0.8
CA+平衡负采样+知识增强(Qwen-7B)	53.8 + 7.1	83.3 + 0.6	74.3 + 4.2
CA+平衡负采样+知识增强(ChatGLM3-6B)	52.5 + 5.8	84.2 + 1.5	71.4 + 1.3
注: 加粗字体表示各列最优结果; 下划线字体表示各列次优结果.

下载: 导出CSV

表 11 六种错误类型在不同方法中所占比例 (%)

Table 11 Proportions of 6 error types across different methods (%)

错误类型	DE	mGENRE	CA-tuned	CoT-KA		LoRA		P-tuning		ICL
错误类型	DE	mGENRE	CA-tuned	Qwen-7B	ChatGLM3-6B	Qwen-7B	ChatGLM3-6B	Qwen-7B	ChatGLM3-6B	Qwen-7B	ChatGLM3-6B
类别	24	26	35	22	21	31	29	30	21	25	19
粒度	23	32	28	36	33	41	29	42	36	36	33
全局	30	10	11	16	18	11	20	14	19	16	24
局部	4	14	6	8	11	3	10	6	14	9	11
时间	8	8	10	11	8	11	6	5	4	5	5
地点	11	9	11	7	9	4	7	4	6	9	8

下载: 导出CSV

表 12 错误样例: 包含上下文、预测实体、正确实体三个信息(中括号内的内容表示指称)

Table 12 Error cases: Contains context, predicted entity, and correct entity, with content in brackets indicating mention

错误种类		上下文	预测实体	正确实体
类别		由猫腻的同名小说改编而成的[《将夜》], 一播出就引起了网友们的关注.	将夜(小说): 《将夜》为网络作家猫腻发布于起点中文网的玄幻网络小说.	将夜(网络剧): Ever Night, 2018年播出的玄幻古装剧.
粒度	“特指”预测为 “泛指”	苹果提交的“通过动态属性而达到的3D[用户界面]显示效果”的专利就曾披露出其对眼部追踪技术的兴趣.	用户界面: User Interface, 简称UI, 是系统和用户之间进行交互和信息交换的媒介, 它实现信息的内部形式与人类可以接受形式之间的转换.	图形用户界面: Graphical User Interface,缩写: GUI, 是指采用图形方式显示的计算机操作界面.
	“泛指”预测为 “特指”	新华社11月8日电(记者: 肖世尧, 张华迎) 2019 [中国(福州)羽毛球公开赛] 8日展开1/4决赛的较量, 赛会卫冕冠军陈雨菲直落两局战胜泰国名将.	2019中国福州羽毛球公开赛: 第2届中国福州羽毛球公开赛, 是2019年世界羽联世界巡回赛的其中一站, 属于第三级别赛事.	中国福州羽毛球公开赛: 一项自2018年起成立、一年一度在中国福建省福州市仓山区举行的国际羽毛球公开锦标赛.
	“整体”预测为 “部分”	古田会议的精髓就是思想建党、政治建军. 85年前, 我军政治工作在古田奠基, 新型人民军队在[古田]定型.	古田会议会址: 位于福建省龙岩市上杭县古田镇, 1929年12月, 毛泽东主持的中国共产党红军第四军第九次代表大会(即古田会议)在此召开, 通过了具有历史意义的《古田会议决议》.	古田镇: 古田镇是福建省上杭县下辖的一个镇, 位于上杭县境东北部, 是2003年评定的第一批中国历史文化名镇之一. 境内有古田会议纪念馆.
	“部分”预测为 “整体”	在餐饮外卖行业, [美团]强调更多的玩家进入餐饮外卖市场对行业是好事, 这意味着蛋糕会越做越大.	美团: 美团是一家面向本地消费产品和零售服务(包括娱乐、餐饮、送货、旅行和其他服务)的中文购物平台. 旗下经营美团网、美团外卖、大众点评网、摩拜单车等互联网平台.	美团外卖: 美团外卖是中国生活服务网站美团网旗下的互联网外卖订餐平台, 由北京三快在线科技有限公司运营, 创立于2013年, 目前合作商户数超过200万家, 覆盖1300多个城市.
全局	指代错误	奥地利选手梅尔泽以7-6和6-1击败克罗地亚卡洛维奇. [梅尔泽]在半决赛中将对阵比利时名将奥−罗切斯.	莱昂纳多·梅耶尔: Leonardo Mayer, 出生于科连特斯, 是一位阿根廷男子职业网球运动员.	于尔根·梅尔策: 奥地利职业网球运动员. 于1999年转为职业选手. 单打最高世界排名是第9位.
	主题错误	摄影师从海南赶到茂名, 与当地一众天文爱好者一路[追星], 终于拍下了这颗绿色彗星.	追星族: 崇拜明星, 积极追随并关注与其有关事物的爱好者.	天文摄影: 天文摄影为一特殊的摄影技术, 可记录各种天体和天象、月球、行星甚至遥远的深空天体.
	角色错误	双方签署协议共同成立“[新闻与传播学院]院务委员会”. 北大将借助新华社的影响力, 建设国际传播研究智库, 打造教学实习和培养从业人员基地.	清华大学新闻与传播学院: 简称新闻学院、新传学院, 是清华大学直属的一个学院.	北京大学新闻与传播学院: 承担北京大学在新闻学和传播学领域教育与研究任务的一个直属学院.
局部		[《2019 MBC演技大赏》]于12月30日晚在首尔麻浦区上岩MBC举行, 由金成柱、韩惠珍主持.	2019 SBS演技大奖: 《2019 SBS演技大奖》为SBS于2019年度颁发的电视剧大奖.	2019 MBC演技大奖:《2019 MBC演技大奖》为MBC于2019年度颁发的电视剧大奖.
时间		当地时间2018年9月15日, 美国北卡罗来纳州, 飓风“[佛罗伦萨]”在美国北卡罗来纳州登陆.	2006年飓风佛罗伦萨: 飓风佛罗伦萨是2006年大西洋飓风季形成的第7场热带风暴和第2场飓风.	飓风佛罗伦斯(2018年): 飓风佛罗伦斯为2018年大西洋飓风季第6个被命名的热带气旋.
地点		该段起于11号线[左岭站] (不含), 终点位于葛店南站.	左岭站: 左岭站位于湖北省武汉市洪山区左岭镇, 是武黄城际铁路上的火车站, 武汉铁路局管辖.	左岭站(武汉地铁): 左岭站是武汉地铁11号线的一座车站, 位于武汉市洪山区.

下载: 导出CSV

参考文献(47)

[1]	郭浩, 李欣奕, 唐九阳, 郭延明, 赵翔. 自适应特征融合的多模态实体对齐研究. 自动化学报, 2024, 50(4): 758−770 Guo Hao, Li Xin-Yi, Tang Jiu-Yang, Guo Yan-Ming, Zhao Xiang. Adaptive feature fusion for multi-modal entity alignment. Acta Automatica Sinica, 2024, 50(4): 758−770
[2]	Hasibi F, Balog K, Bratsberg S E. Exploiting entity linking in queries for entity retrieval. In: Proceedings of the ACM International Conference on the Theory of Information Retrieval. Newark, Delaware, USA: Association for Computing Machinery, 2016. 209−218
[3]	刘琼昕, 王亚男, 龙航, 王佳升, 卢士帅. 基于全局覆盖机制与表示学习的生成式知识问答技术. 自动化学报, 2022, 48(10): 2392−2405 Liu Qiong-Xin, Wang Ya-Nan, Long Hang, Wang Jia-Sheng, Lu Shi-Shuai. Generative knowledge question answering technology based on global coverage mechanism and representation learning. Acta Automatica Sinica, 2022, 48(10): 2392−2405
[4]	Bai J Z, Bai S, Chu Y F, Cui Z Y, Dang K, Deng X D, et al. Qwen technical report. arXiv preprint arXiv: 2309.16609, 2023.
[5]	Zeng A H, Xu B, Wang B W, Zhang C H, Yin D, Zhang D, et al. ChatGLM: A family of large language models from GLM-130B to GLM-4 all tools. arXiv preprint arXiv: 2406.12793, 2024.
[6]	Brown T B, Mann B, Ryder N, Subbiah M, Kaplan J, Dhariwal P, et al. Language models are few-shot learners. arXiv preprint arXiv: 2005.14165, 2020.
[7]	Rawte V, Chakraborty S, Pathak A, Sarkar A, Tonmoy S M T I, Chadha A, et al. The troubling emergence of hallucination in large language models——An extensive definition, quantification, and prescriptive remediations. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing. Singapore: Association for Computational Linguistics, 2023. 2541−2573
[8]	Roberts A, Raffel C, Shazeer N. How much knowledge can you pack into the parameters of a language model? In: Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP). Virtual Event: Association for Computational Linguistics, 2020. 5418−5426
[9]	Kandpal N, Deng H K, Roberts A, Wallace E, Raffel C. Large language models struggle to learn long-tail knowledge. arXiv preprint arXiv: 2211.08411, 2023.
[10]	Wang X T, Yang Q W, Qiu Y T, Liang J Q, He Q Y, Gu Z H, et al. KnowledGPT: Enhancing large language models with retrieval and storage access on knowledge bases. arXiv preprint arXiv: 2308.11761, 2023.
[11]	Ganea O-E, Hofmann T. Deep joint entity disambiguation with local neural attention. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing. Copenhagen, Denmark: Association for Computational Linguistics, 2017. 2619−2629
[12]	Le P, Titov I. Improving entity linking by modeling latent relations between mentions. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. Melbourne, Australia: Association for Computational Linguistics, 2018. 1595−1604
[13]	蒋胜臣, 王红斌, 余正涛, 线岩团, 王红涛. 基于关系指数和表示学习的领域集成实体链接. 自动化学报, 2021, 47(10): 2376−2385 Jiang Sheng-Chen, Wang Hong-Bin, Yu Zheng-Tao, Xian Yan-Tuan, Wang Hong-Tao. Domain integrated-entity links based on relationship indices and representation learning. Acta Automatica Sinica, 2021, 47(10): 2376−2385
[14]	Devlin J, Chang M W, Lee K, Toutanova K. BERT: Pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Minneapolis, USA: Association for Computational Linguistics, 2019. 4171−4186
[15]	Yamada I, Washio K, Shindo H, Matsumoto Y. Global entity disambiguation with BERT. In: Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Seattle, USA: Association for Computational Linguistics, 2022. 3264−3271
[16]	Wu L, Petroni F, Josifoski M, Riedel S, Zettlemoyer L. Scalable zero-shot entity linking with dense entity retrieval. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP). Virtual Event: Association for Computational Linguistics, 2020. 6397−6407
[17]	Wei J, Wang X Z, Schuurmans D, Bosma M, Ichter B, Xia F, et al. Chain-of-thought prompting elicits reasoning in large language models. Advances in Neural Information Processing Systems, 2022, 35: 24824−24837
[18]	Ouyang L, Wu J, Jiang X, Almeida D, Wainwright C, Mishkin P, et al. Training language models to follow instructions with human feedback. arXiv preprint arXiv: 2203.02155, 2022.
[19]	Talmor A, Tafjord O, Clark P, Goldberg Y, Berant J. Leap-of-thought: Teaching pre-trained models to systematically reason over implicit knowledge. Advances in Neural Information Processing Systems, 2020, 33: 20227−20237
[20]	Xu Z R, Shan Z F, Li Y X, Hu B T, Qin B. Hansel: A Chinese few-shot and zero-shot entity linking benchmark. In: Proceedings of the 16th ACM International Conference on Web Search and Data Mining. Singapore: Association for Computing Machinery, 2023. 832−840
[21]	Zeng W X, Zhao X, Tang J Y, Tan Z, Huang X Q. CLEEK: A Chinese long-text corpus for entity linking. In: Proceedings of the 12th Language Resources and Evaluation Conference. Marseille, France: European Language Resources Association, 2020. 2026−2035
[22]	Wu D J, Zhang J, Huang X M. Chain of thought prompting elicits knowledge augmentation. arXiv preprint arXiv: 2307.1640, 2023.
[23]	Humeau S, Shuster K, Lachaux M A, Weston J. Poly-encoders: Transformer architectures and pre-training strategies for fast and accurate multi-sentence scoring. arXiv preprint arXiv: 1905.01969, 2020.
[24]	Houlsby N, Giurgiu A, Jastrzebski S, Morrone B, Laroussilhe Q D, Gesmundo A, et al. Parameter-efficient transfer learning for NLP. arXiv preprint arXiv: 1902.00751, 2019.
[25]	Hu E J, Shen Y L, Wallis P, Allen-Zhu Z Y, Li Y Z, Wang S, et al. LoRA: Low-rank adaptation of large language models. arXiv preprint arXiv: 2106.09685, 2022.
[26]	Liu X, Zheng Y N, Du Z X, Ding M, Qian Y J, Yang Z L, et al. GPT understands, too. AI Open, 2024, 5: 208−215 doi: 10.1016/j.aiopen.2023.08.012
[27]	Liu J C, Shen D H, Zhang Y Z, Dolan B, Carin L, Chen W Z. What makes good in-context examples for GPT-3? In: Proceedings of the Deep Learning Inside Out (DeeLIO 2022): The 3rd Workshop on Knowledge Extraction and Integration for Deep Learning Architectures. Dublin, Ireland: Association for Computational Linguistics, 2022. 100−114
[28]	Reimers N, Gurevych I. Sentence-BERT: Sentence embeddings using siamese BERT-networks. arXiv preprint arXiv: 1908.10084, 2019.
[29]	Huang S J, Wang B B, Qin L B, Zhao Q, Xu R F. Improving few-shot and zero-shot entity linking with coarse-to-fine lexicon-based retriever. In: Processings of the Natural Language Processing and Chinese Computing: 12th National CCF Conference. Foshan, China: 2023. 245−256
[30]	Zhou H Y, Sun C J, Lin L, Shan L L. ERNIE-AT-CEL: A Chinese few-shot emerging entity linking model based on ERNIE and adversarial training. In: Processings of the Natural Language Processing and Chinese Computing: 12th National CCF Conference. Foshan, China: 2023. 48−56
[31]	Xu Z, Shan Z, Hu B, Zhang M. Overview of the NLPCC 2023 shared task 6: Chinese few-shot and zero-shot entity linking. In: Proceedings of Natural Language Processing and Chinese Computing: 12th National CCF Conference. Foshan, China: 2023. 257−265
[32]	de Cao N, Wu L, Popat K, Artetxe M, Goyal N, Plekhanov M, et al. Multilingual autoregressive entity linking. Transactions of the Association for Computational Linguistics, 2022, 10: 274−290 doi: 10.1162/tacl_a_00460
[33]	Zhang Q R, Chen M S, Bukharin A, He P C, Cheng Y, Chen W Z, et al. Adaptive budget allocation for parameter-efficient fine-tuning. arXiv preprint arXiv: 2303.10512v1, 2023.
[34]	Wei J, Tay Y, Bommasani R, Raffel C, Zoph B, Borgeaud S, et al. Emergent abilities of large language models [Online], available: https://openreview.net/forum?id=yzkSU5zdwD, January 15, 2025
[35]	Bonifacio L, Abonizio H, Fadaee M, Nogueira R. InPars: Unsupervised dataset generation for information retrieval. In: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval. Madrid, Spain: Association for Computing Machinery, 2022. 2387−2392
[36]	Ferraretto F, Laitz T, Lotufo R, Nogueira R. ExaRanker: Synthetic explanations improve neural rankers. In: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval. Taipei, China: Association for Computing Machinery, 2023. 2409−2414
[37]	Dettmers T, Pagnoni A, Holtzman A, Zettlemoyer L. QLoRA: Efficient finetuning of quantized LLMs. arXiv preprint arXiv: 2305.14314, 2023.
[38]	Li X L, Liang P. Prefix-tuning: Optimizing continuous prompts for generation. In: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing. Virtual Event: Association for Computational Linguistics, 2021. 4582−4597
[39]	Lester B, Al-Rfou R, Constant N. The power of scale for parameter-efficient prompt tuning. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing. Virtual Event: Punta Cana, Dominican Republic: Association for Computational Linguistics, 2021. 3045−3059
[40]	Liu X, Ji K X, Fu Y C, Tam W L, Du Z X, Yang Z L, et al. P-Tuning: Prompt tuning can be comparable to fine-tuning across scales and tasks. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). Dublin, Ireland: Association for Computational Linguistics, 2022. 61−68
[41]	Lu Y, Bartolo M, Moore A, Riedel S, Stenetorp P. Fantastically ordered prompts and where to find them: Overcoming few-shot prompt order sensitivity. In: Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics. Dublin, Ireland: Association for Computational Linguistics, 2022. 8086−8098
[42]	Zhao T, Wallace E, Feng S, Klein D, Singh S. Calibrate before use: Improving few-shot performance of language models. arXiv preprint arXiv: 2102.09690, 2021.
[43]	Rubin O, Herzig J, Berant J. Learning to retrieve prompts for in-context learning. In: Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Seattle, USA: Association for Computational Linguistics, 2022. 2655−2671
[44]	Min S, Lewis M, Zettlemoyer L, Hajishirzi H. MetaICL: Learning to learn in context. In: Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Seattle, USA: Association for Computational Linguistics, 2022. 2791−2809
[45]	Cho Y M, Zhang L, Callison-Burch C. Unsupervised entity linking with guided summarization and multiple-choice selection. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing. Abu Dhabi, United Arab Emirates: Association for Computational Linguistics, 2022. 9394−9401
[46]	Shi S B, Xu Z R, Hu B T, Zhang M. Generative multimodal entity linking. In: Proceedings of the Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024). Torino, Italy: ELRA and ICCL, 2024. 7654−7665
[47]	de Cao N, Izacard G, Riedel S, Petroni F. Autoregressive entity retrieval. In: Proceedings of the 9th International Conference on Learning Representations. Virtual Event: ICLR, 2021.