田永林, 王雨桐, 王兴霞, 杨静, 沈甜雨, 王建功, 范丽丽, 郭超, 王寿文, 赵勇, 武万森, 王飞跃. 从RAG到SAGE: 现状与展望. 自动化学报, xxxx, xx(x): x−xx doi: 10.16383/j.aas.c240163
Tian Yong-Lin, Wang Yu-Tong, Wang Xing-Xia, Yang Jing, Shen Tian-Yu, Wang Jian-Gong, Fan Li-Li, Guo Chao, Wang Shou-Wen, Zhao Yong, Wu Wan-Sen, Wang Fei-Yue. From retrieval-augmented generation to SAGE: The state of the art and prospects. Acta Automatica Sinica, xxxx, xx(x): x−xx doi: 10.16383/j.aas.c240163

从RAG到SAGE: 现状与展望

doi: 10.16383/j.aas.c240163 cstr: 32138.14.j.aas.c240163
基金项目: 国家自然科学基金青年基金(62303460), 澳门特别行政区科学技术发展基金(0145/2023/RIA3), 中国科协青年人才托举工程(YESS20220372)资助
    作者简介:

    田永林:中国科学院自动化研究所多模态人工智能系统全国重点实验室助理研究员. 2022年获得中国科学技术大学自动化系博士学位. 主要研究方向为平行智能, 自动驾驶, 智能交通. E-mail: yonglin.tian@ia.ac.cn

    王雨桐:中国科学院自动化研究所多模态人工智能系统全国重点实验室副研究员. 2021年获得中国科学院大学控制理论与控制工程专业博士学位. 主要研究方向为计算机视觉, 智能感知. E-mail: yutong.wang@ia.ac.cn

    王兴霞:中国科学院自动化研究所多模态人工智能系统全国重点实验室博士研究生. 2021 年获得南开大学工学硕士学位. 主要研究方向为平行智能, 平行油田, 多智能体系统. E-mail: wangxingxia2022@ia.ac.cn

    杨静:中国科学院自动化研究所多模态人工智能系统全国重点实验室博士研究生. 2020年获得北京化工大学自动化学士学位. 主要研究方向为众包, 平行制造, 社会制造, 预训练语言模型和社会物理信息系统. E-mail: yangjing2020@ia.ac.cn

    沈甜雨:北京化工大学信息科学与技术学院副教授. 2021年获得中国科学院自动化研究所工学博士学位. 主要研究方向为智能感知与智能机器人系统. E-mail: tianyu.shen@buct.edu.cn

    王建功:中国科学院自动化研究所博士研究生. 2018年获得同济大学学士学位. 主要研究方向为计算机视觉, 交通场景理解, 医学图像处理. E-mail: wangjiangong2018@ia.ac.cn

    范丽丽:北京理工大学信息与电子学院博士后. 2022年获得吉林大学博士学位. 主要研究方向为计算机视觉, 跨模态感知与理解, 类脑认知与决策. E-mail: lilifan@bit.edu.cn

    郭超:中国科学院自动化研究所助理研究员. 主要研究方向为机器艺术创作, 人机协作, 智能机器人系统, 机器学习, 强化学习. E-mail: chao.guo@ia.ac.cn

    王寿文:澳门科技大学创新工程学院智能科学与系统专业博士研究生. 主要研究方向为智能系统和复杂系统的建模、分析与控制. E-mail: 2109853pmi3004@student.must.edu.mo

    赵勇:国防科技大学系统工程学院博士研究生. 2021年获得国防科技大学控制科学与工程硕士学位. 主要研究方向为群智感知和人机交互. E-mail: zhaoyong15@nudt.edu.cn

    武万森:国防科技大学系统工程学院博士研究生. 2018年获国防科技大学学士学位. 主要研究方向为视觉语言多模态. E-mail: wuwansen14@nudt.edu.cn

    王飞跃:中国科学院自动化研究所复杂系统管理与控制国家重点实验室研究员. 主要研究方向为智能系统和复杂系统的建模、分析与控制. 本文通信作者. E-mail: feiyue.wang@ia.ac.cn

From retrieval-augmented generation to SAGE: The state of the art and prospects

Funds: Supported by the National Natural Science Foundation of China (62303460), the Science and Technology Development Fund of Macau SAR (0145/2023/RIA3), and the Young Elite Scientists Sponsorship Program of China Association of Science and Technology (YESS20220372)
    Author Bio:

    TIAN Yong-Lin Assistant researcher at The State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China. He received his Ph.D. degree in control science and engineering from the University of Science and Technology of China, Hefei, China, in 2022. His research interests include parallel intelligence, autonomous driving, and intelligent transportation systems

    WANG Yu-Tong Associate researcher at The State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences. She received her Ph. D. degree in control theory and control engineering from University of Chinese Academy of Sciences in 2021. Her research interest covers computer vision and intelligent perception

    WANG Xing-Xia Ph. D. candidate at the State Key Laboratory for Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences. She received her master degree in engineering from Nankai University in 2021. Her research interest covers parallel control, parallel oilfields, and multi-agent systems

    YANG Jing Ph.D. candidate at the State Key Laboratory for Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences. She received her bachelor degree in automation from Beijing University of Chemical Technology in 2020. Her research interest covers crowdsourcing, parallel manufacturing, social manufacturing, pre-trained language models, and cyber-physical-social systems

    SHEN Tian-Yu Associate Professor at College of Information Science and Technology, Beijing University of Chemical Technology. She received the Ph.D. degree from the Institute of Automation, Chinese Academy of Sciences in 2021. Her research interest covers intelligent perception and intelligent unmanned systems

    WANG Jian-Gong Ph. D. candidate in Institute of Automation, Chinese Academy of Sciences. He received his bachelor degree from Tongji University in 2018. His research interest covers computer vision, traffic scene understanding and medical image processing

    FAN Li-Li Postdoctoral fellow at the School of Information and Electronics, Beijing Institute of Technology. She received her Ph.D. degree from Jilin University in 2022. Her research interest covers computer vision, cross-modal perception and understanding, and neuromorphic cognition and decision-making

    GUO Chao Assistant professor with the Institute of Automation, Chinese Academy of Sciences, Beijing, China. His research interests include AI for art creation, human-machine collaboration, intelligent robotic systems, machine learning, and reinforcement learning

    WANG Shou-Wen Ph.D. candidate at the Faculty of Innovation Engineering, Macau University of Science and Technology. His research interest covers modeling, analysis and control of intelligent systems and complex systems

    ZHAO Yong Ph.D. candidate at the College of Systems Engineering, National University of Defense Technology. He received his master degree in control science and engineering from National University of Defense Technology in 2021. His research interest covers crowdsensing and human-computer interaction

    WU Wan-Sen Ph.D. candidate at the College of Systems Engineering, National University of Defense Technology. He received his bachelor degree from National University of Defense Technology in 2018. His research interest covers vision-and-language multi-modality and robotics

    WANG Fei-Yue Professor at the State Key Laboratory for Management and Control of Complex Systems, Institute of Automation, Chinese Academy of Sciences. His research interest covers modeling, analysis, and control of intelligent systems and complex systems. Corresponding author of this paper

  • 摘要: 大模型技术提升了人们获取和利用知识的效率, 但在实际应用中仍然面临着知识受限、迁移障碍和幻觉等问题, 阻碍了可信可靠人工智能系统的构建. 检索增强生成方法(Retrieval-augmented generation, RAG)通过外接知识库和查询关联的检索有效提升了大模型的知识和能力水平, 为大模型掌握实时型、行业型及私有型知识提供了有力支撑, 进而加速了大模型技术向多样场景的推广与落地. 本文围绕检索增强生成技术, 阐述了其基本原理、发展现状及典型应用, 并分析了其优势和面临的挑战. 在此基础上, 引入搜索和缓存管理思想, 提出了RAG的拓展框架SAGE (Search-augmented generation and extension), 以建立更加灵活和高效的大模型知识外挂工具链.
    1) https://github.com/gkamradt/LLMTest_NeedleInAHaystack
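摘要所述的"外接知识库+查询关联检索"流程, 可用如下最小化示意代码刻画(仅为本文构造的说明性示例, 以词袋向量和余弦相似度代替实际系统中的稠密向量模型与向量数据库, 其中语料与函数均为假设):

```python
import math
from collections import Counter

def embed(text):
    # 示意用词袋向量; 实际RAG系统通常采用稠密向量模型(如BGE、M3E)
    return Counter(text.lower().split())

def cosine(a, b):
    # 余弦相似度, 用于度量查询与文本块的关联程度
    dot = sum(v * b[t] for t, v in a.items() if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, chunks, k=2):
    # 检索阶段: 按与查询的相似度对文本块排序, 取top-k
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

def build_prompt(query, chunks, k=2):
    # 生成阶段: 将检索结果拼入提示, 再交由大模型生成答案
    context = "\n".join(retrieve(query, chunks, k))
    return f"请依据以下上下文回答问题.\n上下文:\n{context}\n问题: {query}"

chunks = [
    "RAG augments a language model with retrieved external knowledge.",
    "Vector databases store chunk embeddings for similarity search.",
    "Paris is the capital of France.",
]
prompt = build_prompt("What does RAG add to a language model?", chunks, k=1)
```

该示例仅演示"检索-拼接-生成"的基本骨架; 知识库构建、查询改写、重排序等关键环节见后文各图所示方法.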
  • 图  1  基础型RAG的结构图

    Fig.  1  The framework of Naive RAG

    图  2  RAG关键技术

    Fig.  2  Key technologies in RAG

    图  3  DenseX Retrieval方法[41]的原理图

    Fig.  3  The framework of DenseX Retrieval[41]

    图  4  BGE-LM方法[63]的原理图

    Fig.  4  The framework of BGE-LM[63]

    图  5  Query-Rewriter方法[80]的原理图

    Fig.  5  The framework of Query-Rewriter[80]

    图  6  Query2doc方法[84]的原理图

    Fig.  6  The framework of Query2doc[84]

    图  7  Self-RAG方法[85]的原理图

    Fig.  7  The framework of Self-RAG[85]

    图  8  RichRAG方法[95]的原理图

    Fig.  8  The framework of RichRAG[95]

    图  9  SAGE框架

    Fig.  9  The framework of SAGE

    表  1  RAG综述文章总结与对比

    Table  1  The comparison of surveys on RAG

    文献 年份 RAG技术点 RAG应用领域 RAG平台 新架构
    文献[19] 2024 检索、生成 NLP × ×
    文献[12] 2024 架构、学习、检索 NLP及下游应用 × ×
    文献[20] 2024 检索、生成、搜索 NLP × ×
    文献[21] 2023 架构、检索、生成 NLP及下游应用 × Module RAG
    文献[22] 2022 检索、生成 NLP及下游应用 × ×
    文献[23] 2024 检索、生成 NLP及下游应用 × ×
    本文 2024 知识库、检索、生成 NLP、CV、垂直应用 ✓ SAGE

    表  2  基于RAG的应用案例

    Table  2  RAG-based applications

    方法 应用领域 RAG作用 方法介绍
    UniMS-RAG[87] 通用对话 个性化 知识库阶段, 构建人物角色库与上下文语料库
    ERAGent[104] 通用对话 个性化 生成阶段, 使用人物角色资料作为提示构建的输入
    HyKGE[105] 医疗问答 专业化 检索阶段, 基于医学知识图谱增强医学知识理解
    CBR-RAG[106] 法律问答 专业化 数据库阶段, 基于法律案例库增强法学知识理解
    uRAG[107], SEA[108] 通用对话 实时化 检索阶段, 基于搜索引擎的RAG系统
    RA-VQA[109], KAT[110] 视觉问答 知识增强 生成阶段, 基于检索的知识增强视觉推理能力
    Plug-and-Play[111], MuRAG[112] 图像描述 知识增强 生成阶段, 基于检索的知识增强视觉推理能力
    RA-CM3[113], Re-Imagen[114] 图像生成 知识增强 生成阶段, 基于检索的知识丰富上下文信息
    RAC[115] 图像分类 长尾分布 生成阶段, 融合原始图像和检索内容特征
    文献[116], Make-An-Audio[117] 语音翻译 数据增强 基于检索构建多样化样本
    RAG-Driver[118], 文献[119] 自动驾驶 可解释性 生成阶段, 基于RAG提取相似场景案例

    表  3  RAG开源平台

    Table  3  Open-source platforms of RAG

    名称 发布日期 特点 链接
    LangChain 2022.10 功能多样, 可拓展性强 https://github.com/langchain-ai/langchain
    LlamaIndex 2023.05 数据搜索检索效率高 https://github.com/jerryjliu/llama_index
    HayStack 2019.11 侧重文本检索和问答应用开发 https://github.com/deepset-ai/haystack
    Embedchain 2023.07 轻量化, 灵活, 可拓展性强 https://github.com/mem0ai/embedchainjs
    NeumAI 2023.12 高吞吐分布式架构 https://github.com/NeumTry/NeumAI
    GraphRAG 2023.07 知识图谱增强的全面信息理解 https://github.com/microsoft/graphrag
    Quivr 2023.05 基于LangChain的知识库应用平台 https://github.com/QuivrHQ/quivr
    Dify 2023.05 生成式AI开发框架 https://github.com/langgenius/dify
    RagFlow 2024.07 自动化RAG构建, 流程精简 https://github.com/infiniflow/ragflow
    Open-WebUI 2024.02 支持友好界面以及完全离线运行 https://github.com/open-webui/open-webui

    表  4  中英文术语对照表

    Table  4  Glossary of Chinese-English terms

    中文名称 英文名称
    检索增强生成技术 Retrieval-augmented generation, RAG[19−23]
    大语言模型 Large language model, LLM[5−8]
    自然语言处理 Natural language processing, NLP
    计算机视觉 Computer vision, CV
    数据分块 Data chunking[38−41]
    独热编码 One-hot encoding[49]
    词袋模型 Bag of words, BOW[50]
    词频-逆向文件频率 Term frequency-inverse document frequency, TF-IDF[51]
    N元模型 N-Gram[52]
    海量文本语义向量基准测试 Massive text embedding benchmark, MTEB[55]
    退后提示 Step back prompting[81]
    多路召回 Multi query retrieval[82]
    假想文档嵌入 Hypothetical document embeddings, HyDE[83, 84]
    外部知识视觉问答任务 Outside knowledge visual question answering, OKVQA[110]
    思维链 Chain of thought, CoT[79]
    搜索增强的生成与扩展技术 Search-augmented generation and extension, SAGE
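表中的词袋模型与词频-逆向文件频率(TF-IDF)是稠密向量模型出现之前最常用的文本表征, 其计算方式可用如下示意代码说明(语料与函数均为本文假设的最小示例):

```python
import math
from collections import Counter

# 假设的玩具语料, 每个字符串视为一篇文档
corpus = [
    "retrieval augmented generation",
    "retrieval of external knowledge",
    "large language model generation",
]

def tf_idf(term, doc, corpus):
    # TF-IDF = 词频(TF) × 逆向文件频率(IDF)
    words = doc.split()
    tf = Counter(words)[term] / len(words)
    df = sum(1 for d in corpus if term in d.split())
    idf = math.log(len(corpus) / df) if df else 0.0
    return tf * idf

# "retrieval"出现在2/3的文档中, 而"knowledge"仅出现在1个文档中,
# 因此在同一文档内"knowledge"获得更高的TF-IDF权重, 更能刻画该文档的特异性
```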

    表  5  基于RAG的FLARE方法[86]与无检索基线方法的实验结果对比

    Table  5  Comparison of experimental results between the RAG-based FLARE method[86] and the non-retrieval baseline method

    数据集 StrategyQA | ASQA | ASQA-hint | WikiAsp
    指标 EM | EM D-F1 R-L DR | EM D-F1 R-L DR | UniEval[166] E-F1 R-L
    无检索 72.9 | 33.8 24.2 33.3 28.4 | 40.1 32.5 36.4 34.4 | 47.1 14.1 26.4
    FLARE 77.3 | 41.3 28.2 34.3 31.1 | 46.2 36.7 37.7 37.2 | 53.4 18.9 27.6
  • [1] 田永林, 王雨桐, 王建功, 等. 视觉Transformer研究的关键问题: 现状及展望. 自动化学报, 2022, 48(4): 957−979 doi: 10.16383/j.aas.c220027

    Tian Yong-Lin, Wang Yu-Tong, Wang Jian-Gong, Wang Xiao, Wang Fei-Yue. Key problems and progress of vision Transformers: The state of the art and prospects. Acta Automatica Sinica, 2022, 48(4): 957−979 doi: 10.16383/j.aas.c220027
    [2] Casper S, Davies X, Shi C, et al. Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback[J]. Transactions on Machine Learning Research, 2023.
    [3] Croitoru F A, Hondru V, Ionescu R T, et al. Diffusion models in vision: A survey[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023.
    [4] Muennighoff N, Rush A, Barak B, et al. Scaling data-constrained language models[J]. Advances in Neural Information Processing Systems, 2024, 36.
    [5] Wang Y, Pan Y, Yan M, et al. A survey on ChatGPT: AI-generated contents, challenges, and solutions[J]. IEEE Open Journal of the Computer Society, 2023.
    [6] 王晓, 张翔宇, 周锐, 田永林, 王建功, 陈龙, 孙长银. 基于平行测试的认知自动驾驶智能架构研究. 自动化学报, 2024, 50(2): 356−371

    Wang Xiao, Zhang Xiang-Yu, Zhou Rui, Tian Yong-Lin, Wang Jian-Gong, Chen Long, Sun Chang-Yin. An intelligent architecture for cognitive autonomous driving based on parallel testing. Acta Automatica Sinica, 2024, 50(2): 356−371
    [7] Fan L, Guo C, Tian Y, et al. Sora for foundation robots with parallel intelligence: Three world models, three robotic systems. Frontiers of Information Technology & Electronic Engineering, 2024: 1−7
    [8] Moor M, Banerjee O, Abad Z S H, et al. Foundation models for generalist medical artificial intelligence. Nature, 2023, 616(7956): 259−265 doi: 10.1038/s41586-023-05881-4
    [9] 卢经纬, 郭超, 戴星原, 等. 问答ChatGPT之后: 超大预训练模型的机遇和挑战. 自动化学报, 2023, 49(4): 705−717 doi: 10.16383/j.aas.c230107

    Lu Jing-Wei, Guo Chao, Dai Xing-Yuan, Miao Qing-Hai, Wang Xing-Xia, Yang Jing, Wang Fei-Yue. The ChatGPT after: Opportunities and challenges of very large scale pre-trained models. Acta Automatica Sinica, 2023, 49(4): 705−717 doi: 10.16383/j.aas.c230107
    [10] Currie G M. Academic integrity and artificial intelligence: is ChatGPT hype, hero or heresy?[C]//Seminars in Nuclear Medicine. WB Saunders, 2023.
    [11] Hirano Y, Hanaoka S, Nakao T, et al. GPT-4 Turbo with Vision fails to outperform text-only GPT-4 Turbo in the Japan Diagnostic Radiology Board Examination. Japanese Journal of Radiology, 2024: 1−9
    [12] Ding Y, Fan W, Ning L, et al. A Survey on RAG Meets LLM: Towards Retrieval-Augmented Large Language Models[J]. arXiv preprint arXiv: 2405.06211, 2024.
    [13] Xiong G, Jin Q, Lu Z, et al. Benchmarking retrieval-augmented generation for medicine[J]. arXiv preprint arXiv: 2402.13178, 2024.
    [14] Zhao A, Huang D, Xu Q, et al. Expel: Llm agents are experiential learners[C]//Proceedings of the AAAI Conference on Artificial Intelligence. 2024, 38(17): 19632-19642.
    [15] Zhai Y, Tong S, Li X, et al. Investigating the Catastrophic Forgetting in Multimodal Large Language Model Fine-Tuning[C]//Conference on Parsimony and Learning. PMLR, 2024: 202-227.
    [16] Gupta S, Jegelka S, Lopez-Paz D, et al. Context is Environment[C]//The Twelfth International Conference on Learning Representations. 2023.
    [17] Ji Z, Yu T, Xu Y, et al. Towards mitigating LLM hallucination via self reflection[C]//Findings of the Association for Computational Linguistics: EMNLP 2023. 2023: 1827-1843.
    [18] Lewis P, Perez E, Piktus A, Petroni F, Karpukhin V, Goyal N, Kuttler H, Lewis M, Yih WT, Rocktaschel T, Riedel S. Retrieval-augmented generation for knowledge-intensive nlp tasks. Advances in Neural Information Processing Systems, 2020, 33: 9459−74
    [19] Huang Y, Huang J. A Survey on Retrieval-Augmented Text Generation for Large Language Models[J]. arXiv preprint arXiv: 2404.10981, 2024.
    [20] Zhu Y, Yuan H, Wang S, Liu J, Liu W, Deng C, Dou Z, Wen JR. Large language models for information retrieval: A survey[J]. arXiv preprint arXiv: 2308.07107, 2023.
    [21] Gao Y, Xiong Y, Gao X, Jia K, Pan J, Bi Y, Dai Y, Sun J, Wang H. Retrieval-augmented generation for large language models: A survey[J]. arXiv preprint arXiv: 2312.10997, 2023.
    [22] Li H, Su Y, Cai D, Wang Y, Liu L. A survey on retrieval-augmented text generation[J]. arXiv preprint arXiv: 2202.01110, 2022.
    [23] Hu Y, Lu Y. RAG and RAU: A Survey on Retrieval-Augmented Language Model in Natural Language Processing[J]. arXiv preprint arXiv: 2404.19543, 2024.
    [24] 田永林, 王兴霞, 王雨桐, 等. RAG-PHI: 检索增强生成驱动的平行人与平行智能. 智能科学与技术学报, 2024, 6(1): 41−51

    Tian Yong-Lin, Wang Xing-Xia, Wang Yu-Tong, Wang Jian-Gong, Guo Chao, Fan Li-Li, Shen Tian-Yu, Wu Wan-Sen, Zhang Hong-Mei, Zhu Zheng-Qiu, Wang Fei-Yue. RAG-PHI: RAG-driven parallel human and parallel intelligence. Chinese Journal of Intelligent Science and Technology, 2024, 6(1): 41−51
    [25] Kaddour J, Harris J, Mozes M, Bradley H, Raileanu R, McHardy R. Challenges and applications of large language models[J]. arXiv preprint arXiv: 2307.10169, 2023.
    [26] Dai X, Guo C, Tang Y, et al. VistaRAG: Toward Safe and Trustworthy Autonomous Driving Through Retrieval-Augmented Generation[J]. IEEE Transactions on Intelligent Vehicles, 2024.
    [27] Dave T, Athaluri S A, Singh S. ChatGPT in medicine: an overview of its applications, advantages, limitations, future prospects, and ethical considerations. Frontiers in Artificial Intelligence, 2023, 6: 1169595.
    [28] Louis A, van Dijck G, Spanakis G. Interpretable long-form legal question answering with retrieval-augmented large language models[C]//Proceedings of the AAAI Conference on Artificial Intelligence. 2024, 38(20): 22266-22275.
    [29] Wu S, Irsoy O, Lu S, Dabravolski V, Dredze M, Gehrmann S, Kambadur P, Rosenberg D, Mann G. Bloomberggpt: A large language model for finance[J]. arXiv preprint arXiv: 2303.17564, 2023.
    [30] 王飞跃, 王艳芬, 陈薏竹, 田永林, 齐红威, 王晓, 张卫山, 张俊, 袁勇. 联邦生态: 从联邦数据到联邦智能. 智能科学与技术学报, 2020, 2(4): 305−313 doi: 10.11959/j.issn.2096-6652.202033

    Wang Fei-Yue, Wang Yan-Fen, Chen Yi-Zhu, Tian Yong-Lin, Qi Hong-Wei, Wang Xiao, Zhang Wei-Shan, Zhang Jun, Yuan Yong. Federated ecology: from federated data to federated intelligence. Chinese Journal of Intelligent Science and Technology, 2020, 2(4): 305−313 doi: 10.11959/j.issn.2096-6652.202033
    [31] Gemini Team, Anil R, Borgeaud S, Wu Y, Alayrac JB, Yu J, Soricut R, Schalkwyk J, Dai AM, Hauth A, Millican K. Gemini: A family of highly capable multimodal models[J]. arXiv preprint arXiv: 2312.11805, 2023.
    [32] Lewis M, Liu Y, Goyal N, et al. Bart: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension[J]. arXiv preprint arXiv: 1910.13461, 2019.
    [33] Devlin J, Chang M W, Lee K, Toutanova K. BERT: Pre-training of deep bidirectional transformers for language understanding[J]. arXiv preprint arXiv: 1810.04805, 2018.
    [34] Wang Yuxin, Sun Qingxuan, He Sicheng. M3E: Moka Massive Mixed Embedding Model, 2023.
    [35] Neelakantan A, Xu T, Puri R, et al. Text and code embeddings by contrastive pre-training[J]. arXiv preprint arXiv: 2201.10005, 2022.
    [36] Karpukhin V, Oguz B, Min S, et al. Dense passage retrieval for open-domain question answering[J]. arXiv preprint arXiv: 2004.04906, 2020.
    [37] Yang Z, Qi P, Zhang S, et al. HotpotQA: A dataset for diverse, explainable multi-hop question answering[J]. arXiv preprint arXiv: 1809.09600, 2018.
    [38] Lewis P, Perez E, Piktus A, Petroni F, Karpukhin V, Goyal N, Kuttler H, Lewis M, Yih WT, Rocktaschel T, Riedel S. Retrieval-augmented generation for knowledge-intensive nlp tasks. Advances in Neural Information Processing Systems, 2020, 33: 9459−74
    [39] Chen HY, Yu H. Intent-based web page summarization with structure-aware chunking and generative language models[C]//Companion Proceedings of the ACM Web Conference 2023. 2023: 310-313.
    [40] Xiao S, Liu Z, Zhang P, Muennighoff N. C-Pack: Packaged resources to advance general Chinese embedding[J]. arXiv preprint arXiv: 2309.07597, 2023.
    [41] Chen T, Wang H, Chen S, Yu W, Ma K, Zhao X, Yu D, Zhang H. Dense X Retrieval: What Retrieval Granularity Should We Use?[J]. arXiv preprint arXiv: 2312.06648, 2023.
    [42] Chung, Hyung Won, et al. "Scaling instruction-finetuned language models." Journal of Machine Learning Research 25.70 (2024): 1-53.
    [43] Manning C, Schutze H. Foundations of statistical natural language processing[M]. MIT press, 1999.
    [44] Zhang, J., Wang, J., & Wang, J. Text chunking based on non-overlapping contexts. Expert Systems with Applications, 2010, 37(9): 6523−6529
    [45] Barzilay, R., & Elhadad, M. Using lexical chains for text summarization. In Proceedings of the ACL Workshop on Intelligent Scalable Text Summarization pp. 10-17, 1997.
    [46] Nakano, M., Yamamoto, H., & Nishida, T. Extracting context structure of text documents using discourse information. In Proceedings of the 2004 ACM symposium on Applied computing, pp. 1423-1428, 2004.
    [47] Lin, C. Y. ROUGE: A package for automatic evaluation of summaries. In Text Summarization Branches Out, pp. 74-81, 2004.
    [48] Wan, X. Using bilingual knowledge and ensemble techniques for unsupervised Chinese word segmentation. In Proceedings of the 22nd International Conference on Computational Linguistics-Volume 1, pp. 993-1000, 2008.
    [49] Rodriguez P, Bautista M A, Gonzalez J, et al. Beyond one-hot encoding: Lower dimensional target embedding. Image and Vision Computing, 2018, 75: 21−31 doi: 10.1016/j.imavis.2018.04.004
    [50] Zhang Y, Jin R, Zhou Z H. Understanding bag-of-words model: a statistical framework. International journal of machine learning and cybernetics, 2010, 1: 43−52 doi: 10.1007/s13042-010-0001-0
    [51] Chowdhury, Gobinda G. Introduction to modern information retrieval. Facet publishing, 2010.
    [52] Kondrak G. N-gram similarity and distance[C]//International symposium on string processing and information retrieval. Berlin, Heidelberg: Springer Berlin Heidelberg, 2005: 115-126.
    [53] Mikolov T, Chen K, Corrado G, Dean J. Efficient estimation of word representations in vector space[J]. arXiv preprint arXiv: 1301.3781, 2013.
    [54] Pennington J, Socher R, Manning CD. Glove: Global vectors for word representation[C]//Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP). 2014: 1532-1543.
    [55] Muennighoff N, Tazi N, Magne L, et al. MTEB: Massive Text Embedding Benchmark[C]//Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics. 2023: 2014-2037.
    [56] Meng R, Liu Y, Joty SR, Xiong C, Zhou Y, Yavuz S. SFR-Embedding-Mistral: Enhance text retrieval with transfer learning. Salesforce AI Research Blog, 2024. https://blog.salesforceairesearch.com/sfr-embedded-mistral/
    [57] Muennighoff N, Su H, Wang L, Yang N, Wei F, Yu T, Singh A, Kiela D. Generative Representational Instruction Tuning[J]. arXiv preprint arXiv: 2402.09906, 2024.
    [58] Wang L, Yang N, Huang X, Yang L, Majumder R, Wei F. Improving text embeddings with large language models[J]. arXiv preprint arXiv: 2401.00368, 2023.
    [59] Yang A, Xiao B, Wang B, Zhang B, Bian C, Yin C, Lv C, Pan D, Wang D, Yan D, Yang F. Baichuan 2: Open large-scale language models[J]. arXiv preprint arXiv: 2309.10305, 2023.
    [60] Chen J, Xiao S, Zhang P, Luo K, Lian D, Liu Z. Bge m3-embedding: Multi-lingual, multi-functionality, multi-granularity text embeddings through self-knowledge distillation[J]. arXiv preprint arXiv: 2402.03216, 2024.
    [61] Wang Yuxin, Sun Qingxuan, He Sicheng. M3E: Moka Massive Mixed Embedding Model, 2023.
    [62] Chen, J., Lin, H., Han, X., & Sun, L. Benchmarking Large Language Models in Retrieval-Augmented Generation. Proceedings of the AAAI Conference on Artificial Intelligence, 2024, 38(16): 17754−17762 doi: 10.1609/aaai.v38i16.29728
    [63] Luo K, Liu Z, Xiao S, et al. BGE Landmark Embedding: A Chunking-Free Embedding Method For Retrieval Augmented Long-Context Large Language Models[J]. arXiv preprint arXiv: 2402.11573, 2024.
    [64] Touvron H, Lavril T, Izacard G, et al. Llama: Open and efficient foundation language models[J]. arXiv preprint arXiv: 2302.13971, 2023.
    [65] He K, Zhang X, Ren S, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2016: 770-778.
    [66] Tan M, Le Q. Efficientnet: Rethinking model scaling for convolutional neural networks[C]//International conference on machine learning. PMLR, 2019: 6105-6114.
    [67] Radford A, Kim JW, Hallacy C, Ramesh A, Goh G, Agarwal S, Sastry G, Askell A, Mishkin P, Clark J, Krueger G. "Learning transferable visual models from natural language supervision, " in International conference on machine learning 2021 Jul 1 (pp. 8748-8763). PMLR.
    [68] Tsalera E, Papadakis A, Samarakou M. Comparison of pre-trained CNNs for audio classification using transfer learning. Journal of Sensor and Actuator Networks, 2021, 10(4): 72 doi: 10.3390/jsan10040072
    [69] Yuan Y, Xun G, Suo Q, et al. Wave2vec: Learning deep representations for biosignals[C]//2017 IEEE International Conference on Data Mining (ICDM). IEEE, 2017: 1159-1164.
    [70] Lin G, Zhang Y, Xu G, et al. Smoke detection on video sequences using 3D convolutional neural networks. Fire Technology, 2019, 55: 1827−1847 doi: 10.1007/s10694-019-00832-w
    [71] Bertasius G, Wang H, Torresani L. Is space-time attention all you need for video understanding?[C]//ICML. 2021, 2(3): 4.
    [72] Robertson, Stephen, and Hugo Zaragoza. "The probabilistic relevance framework: BM25 and beyond." Foundations and Trends in Information Retrieval 3.4 (2009): 333-389.
    [73] McCandless, Michael, et al. Lucene in action. Vol. 2. Greenwich: Manning, 2010.
    [74] Gormley, Clinton, and Zachary Tong. Elasticsearch: the definitive guide: a distributed real-time search and analytics engine. " O'Reilly Media, Inc.", 2015.
    [75] Chang, Xinyuan. "The Analysis of Open Source Search Engines." Highlights in Science, Engineering and Technology 32 (2023): 32-42.
    [76] Kumar R, Sharma SC. "Smart information retrieval using query transformation based on ontology and semantic association."International Journal of Advanced Computer Science and Applications, vol. 13, no. 4, 2022.
    [77] Singh J, Prasad M, Prasad OK, Meng Joo E, Saxena AK, Lin CT. A novel fuzzy logic model for pseudo-relevance feedback-based query expansion. International Journal of Fuzzy Systems, 2016, 18: 980−989 doi: 10.1007/s40815-016-0254-1
    [78] Kim L, Yahia E, Segonds F, Véron P, Mallet A. i-Dataquest: A heterogeneous information retrieval tool using data graph for the manufacturing industry. Computers in Industry, 2021, 132: 103527.
    [79] Wei J, Wang X, Schuurmans D, et al. Chain-of-thought prompting elicits reasoning in large language models. Advances in neural information processing systems, 2022, 35: 24824−24837
    [80] Ma X, Gong Y, He P, et al. Query Rewriting in Retrieval-Augmented Large Language Models[C]//The 2023 Conference on Empirical Methods in Natural Language Processing. 2023.
    [81] Zheng H S, Mishra S, Chen X, et al. Step-Back Prompting Enables Reasoning Via Abstraction in Large Language Models[C]//The Twelfth International Conference on Learning Representations. 2023.
    [82] Wang Z, Wu Y, Narasimhan K, et al. Multi-query video retrieval[C]//European Conference on Computer Vision. Cham: Springer Nature Switzerland, 2022: 233-249.
    [83] Gao L, Ma X, Lin J, Callan J. Precise zero-shot dense retrieval without relevance labels[C]//Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics. 2023: 1762-1777.
    [84] Wang L, Yang N, Wei F. Query2doc: Query expansion with large language models[J]. arXiv preprint arXiv: 2303.07678, 2023.
    [85] Asai A, Wu Z, Wang Y, Sil A, Hajishirzi H. Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection[C]//The Twelfth International Conference on Learning Representations. 2023.
    [86] Jiang Z, Xu FF, Gao L, Sun Z, Liu Q, Dwivedi-Yu J, Yang Y, Callan J, Neubig G. Active Retrieval Augmented Generation[C]//Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. 2023: 7969-7992.
    [87] Wang H, Huang W, Deng Y, et al. UniMS-RAG: A unified multi-source retrieval-augmented generation for personalized dialogue systems[J]. arXiv preprint arXiv: 2401.13256, 2024.
    [88] Yuan Z, Zhang W, Tian C, Mao Y, Zhou R, Wang H, Fu K, Sun X. MCRN: A multi-source cross-modal retrieval network for remote sensing. International Journal of Applied Earth Observation and Geoinformation, 2022, 115: 103071.
    [89] Bouchakwa, M., Ayadi, Y., Amous, I. Multi-level diversification approach of semantic-based image retrieval results. Prog Artif Intell 9, 1–30 (2020). https://doi.org/10.1007/s13748-019-00195-x.
    [90] Wen-Hui L., Song Y., Yan W., Dan S., Xuan-Ya L., Multi-level similarity learning for image-text retrieval, Information Processing & Management, Volume 58, Issue 1, 2021, 102432, ISSN 0306-4573, https://doi.org/10.1016/j.ipm.2020.102432.
    [91] Jeong S, Baek J, Cho S, et al. Adaptive-RAG: Learning to Adapt Retrieval-Augmented Large Language Models through Question Complexity[J]. arXiv preprint arXiv: 2403.14403, 2024.
    [92] Malkov YA, Yashunin DA. Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs. IEEE transactions on pattern analysis and machine intelligence, 2018, 42(4): 824−836
    [93] Li W, Li J, Ma W, Liu Y. Citation-enhanced generation for LLM-based chatbots[J]. arXiv preprint arXiv: 2402.16063, 2024.
    [94] Glass M, Rossiello G, Chowdhury M F M, Naik A, Cai P, Gliozzo A. Re2G: Retrieve, rerank, generate[C]//Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2022: 2701-2715.
    [95] Wang S, Xu X, Wang M, et al. RichRAG: Crafting Rich Responses for Multi-faceted Queries in Retrieval-Augmented Generation[J]. arXiv preprint arXiv: 2406.12566, 2024.
    [96] Xu Z, Liu Z, Liu Y, Xiong C, Yan Y, Wang S, Yu S, Liu Z, Yu G. ActiveRAG: Revealing the Treasures of Knowledge via Active Learning. arXiv preprint arXiv: 2402.13547. 2024 Feb 21.
    [97] Izacard G, Grave E. Leveraging passage retrieval with generative models for open domain question answering[J]. arXiv preprint arXiv: 2007.01282, 2020.
    [98] Zhang J, Wang X, Zhang H, Sun H, Liu X. Retrieval-based neural source code summarization. Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering, 2020: 1385–1397.
    [99] Khandelwal U, Levy O, Jurafsky D, Zettlemoyer L, Lewis M. Generalization through memorization: Nearest neighbor language models. International Conference on Learning Representations, 2019.
    [100] Poesia Gabriel, Polozov Oleksandr, Le Vu, Tiwari Ashish, Soares Gustavo, Meek Christopher, Gulwani Sumit. Synchromesh: Reliable code generation from pre-trained language models. arXiv preprint arXiv: 2201.11227, 2022.
    [101] Joshi H, Sanchez J C, Gulwani S, et al. Repair is nearly generation: Multilingual program repair with LLM. Proceedings of the AAAI Conference on Artificial Intelligence, 2023, 37(4): 5131−5140 doi: 10.1609/aaai.v37i4.25642
    [102] Zheng L, Chiang WL, Sheng Y, Li T, Zhuang S, Wu Z, Zhuang Y, Li Z, Lin Z, Xing E, Gonzalez JE. Lmsys-chat-1m: A large-scale real-world llm conversation dataset[J]. arXiv preprint arXiv: 2309.11998, 2023.
    [103] Zhu Y, Ren C, Xie S, Liu S, Ji H, Wang Z, Sun T, He L, Li Z, Zhu X, Pan C. REALM: RAG-Driven Enhancement of Multimodal Electronic Health Records Analysis via Large Language Models[J]. arXiv preprint arXiv: 2402.07016, 2024.
    [104] Shi Y, Zi X, Shi Z, et al. ERAGent: Enhancing Retrieval-Augmented Language Models with Improved Accuracy, Efficiency, and Personalization[J]. arXiv preprint arXiv: 2405.06683, 2024.
    [105] Jiang X, Zhang R, Xu Y, et al. Think and Retrieval: A Hypothesis Knowledge Graph Enhanced Medical Large Language Models. arXiv preprint arXiv: 2312.15883, 2023.
    [106] Wiratunga N, Abeyratne R, Jayawardena L, et al. CBR-RAG: Case-Based Reasoning for Retrieval Augmented Generation in LLM for Legal Question Answering[J]. arXiv preprint arXiv: 2404.04302, 2024.
    [107] Salemi A, Zamani H. Towards a search engine for machines: Unified ranking for multiple retrieval-augmented large language models. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2024: 741−751.
    [108] Komeili M, Shuster K, Weston J. Internet-augmented dialogue generation. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics, 2022: 8460−8478.
    [109] Lin W, Byrne B. Retrieval augmented visual question answering with outside knowledge. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022: 11238−11254.
    [110] Gui L, Wang B, Huang Q, Hauptmann A, Bisk Y, Gao J. KAT: A knowledge augmented transformer for vision-and-language. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022: 956−968.
    [111] Tiong A M, Li J, Li B, Savarese S, Hoi S C. Plug-and-play VQA: Zero-shot VQA by conjoining large pretrained models with zero training. Findings of the Association for Computational Linguistics: EMNLP 2022, 2022: 951−967.
    [112] Chen W, Hu H, Chen X, Verga P, Cohen W W. MuRAG: Multimodal retrieval-augmented generator for open question answering over images and text. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022: 5558−5570.
    [113] Yasunaga M, Aghajanyan A, Shi W, James R, Leskovec J, Liang P, Lewis M, Zettlemoyer L, Yih W T. Retrieval-augmented multimodal language modeling. Proceedings of the 40th International Conference on Machine Learning (ICML), 2023: 39755−39769.
    [114] Chen W, Hu H, Saharia C, Cohen W W. Re-Imagen: Retrieval-augmented text-to-image generator. International Conference on Learning Representations (ICLR), 2023.
    [115] Long A, Yin W, Ajanthan T, Nguyen V, Purkait P, Garg R, Blair A, Shen C, van den Hengel A. Retrieval augmented classification for long-tail visual recognition. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022: 6959−6969.
    [116] Zhao J, Haffar G, Shareghi E. Generating synthetic speech from SpokenVocab for speech translation. Findings of the Association for Computational Linguistics: EACL 2023, 2023: 1930−1936.
    [117] Huang R, Huang J, Yang D, Ren Y, Liu L, Li M, Ye Z, Liu J, Yin X, Zhao Z. Make-An-Audio: Text-to-audio generation with prompt-enhanced diffusion models. Proceedings of the 40th International Conference on Machine Learning (ICML), 2023: 13916−13932.
    [118] Yuan J, Sun S, Omeiza D, et al. RAG-Driver: Generalisable driving explanations with retrieval-augmented in-context learning in multi-modal large language model. arXiv preprint arXiv: 2402.10828, 2024.
    [119] Hussien M M, Melo A N, Ballardini A L, et al. RAG-based explainable prediction of road users behaviors for automated driving using knowledge graphs and large language models. arXiv preprint arXiv: 2405.00449, 2024.
    [120] Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez A N, Kaiser Ł, Polosukhin I. Attention is all you need. Advances in Neural Information Processing Systems (NIPS 2017), 2017.
    [121] Chen M, Tworek J, Jun H, Yuan Q, Pinto H P, Kaplan J, Edwards H, Burda Y, Joseph N, Brockman G, Ray A. Evaluating large language models trained on code. arXiv preprint arXiv: 2107.03374, 2021.
    [122] Ziegler A, Kalliamvakou E, Li X A, Rice A, Rifkin D, Simister S, Sittampalam G, Aftandilian E. Productivity assessment of neural code completion. Proceedings of the 6th ACM SIGPLAN International Symposium on Machine Programming, 2022: 21−29.
    [123] Li Y, Choi D, Chung J, Kushman N, Schrittwieser J, Leblond R, Eccles T, Keeling J, Gimeno F, Dal Lago A, Hubert T. Competition-level code generation with AlphaCode. arXiv preprint arXiv: 2203.07814, 2022.
    [124] Nijkamp E, Pang B, Hayashi H, Tu L, Wang H, Zhou Y, Savarese S, Xiong C. A conversational paradigm for program synthesis. arXiv preprint arXiv: 2203.13474, 2022.
    [125] Fried D, Aghajanyan A, Lin J, Wang S, Wallace E, Shi F, Zhong R, Yih W T, Zettlemoyer L, Lewis M. InCoder: A generative model for code infilling and synthesis. arXiv preprint arXiv: 2204.05999, 2022.
    [126] Chowdhery A, Narang S, Devlin J, Bosma M, Mishra G, Roberts A, Barham P, Chung H W, Sutton C, Gehrmann S, Schuh P. PaLM: Scaling language modeling with pathways. Journal of Machine Learning Research, 2023, 24(240): 1−13
    [127] Zheng Q, Xia X, Zou X, Dong Y, Wang S, Xue Y, Shen L, Wang Z, Wang A, Li Y, Su T, Yang Z, Tang J. CodeGeeX: A pre-trained model for code generation with multilingual benchmarking on HumanEval-X. Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023: 5673−5684.
    [128] Selvaraju R R, Cogswell M, Das A, et al. Grad-CAM: Visual explanations from deep networks via gradient-based localization. Proceedings of the IEEE International Conference on Computer Vision, 2017: 618−626.
    [129] Yuan Z, Xi Q, Tan C, Zhao Z, Yuan H, Huang F, Huang S. RAMM: Retrieval-augmented biomedical visual question answering with multi-modal pre-training. arXiv preprint arXiv: 2303.00534, 2023.
    [130] Zhou Y, Long G. Style-aware contrastive learning for multi-style image captioning. Findings of the Association for Computational Linguistics: EACL 2023, 2023: 2212−2222.
    [131] Shen S, Li C, Hu X, Xie Y, Yang J, Zhang P, Gan Z, Wang L, Yuan L, Liu C, Keutzer K. K-LITE: Learning transferable visual models with external knowledge. Advances in Neural Information Processing Systems, 2022, 35: 15558−15573
    [132] Fellbaum C (Ed.). WordNet: An electronic lexical database. MIT Press, 1998.
    [133] Zesch T, Müller C, Gurevych I. Using Wiktionary for computing semantic relatedness. Proceedings of the AAAI Conference on Artificial Intelligence, 2008: 861−866.
    [134] Liu H, Son K, Yang J, Liu C, Gao J, Lee Y J, Li C. Learning customized visual models with retrieval-augmented knowledge. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023: 15148−15158.
    [135] Whitehead S, Ji H, Bansal M, Chang S F, Voss C. Incorporating background knowledge into video description generation. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, 2018: 3992−4001.
    [136] Le H, Chen N, Hoi S. VGNMN: Video-grounded neural module networks for video-grounded dialogue systems. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022: 3377−3393.
    [137] Kim M, Sung-Bin K, Oh T H. Prefix tuning for automated audio captioning. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023: 1−5.
    [138] Mestre R, Middleton S E, Ryan M, Gheasi M, Norman T, Zhu J. Augmenting pre-trained language models with audio feature embedding for argumentation mining in political debates. Findings of the Association for Computational Linguistics: EACL 2023, 2023: 274−288.
    [139] Shu Y, Yu Z, Li Y, Karlsson B F, Ma T, Qu Y, Lin C Y. TIARA: Multi-grained retrieval for robust question answering over large knowledge base. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022: 8108−8121.
    [140] Pan F, Canim M, Glass M, Gliozzo A, Fox P. CLTR: An end-to-end, transformer-based system for cell-level table retrieval and table question answering. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstrations, 2021: 202−209.
    [141] Yang Z, Qin J, Chen J, Lin L, Liang X. LogicSolver: Towards interpretable math word problem solving with logical prompt-enhanced learning. Findings of the Association for Computational Linguistics: EMNLP 2022, 2022: 1−13.
    [142] He H, Zhang H, Roth D. Rethinking with retrieval: Faithful large language model inference. arXiv preprint arXiv: 2301.00303, 2022.
    [143] Li X, Zhao R, Chia YK, Ding B, Bing L, Joty S, Poria S. Chain of knowledge: A framework for grounding large language models with structured knowledge bases. arXiv preprint arXiv: 2305.13269, 2023.
    [144] Zhou H, Gu B, Zou X, et al. A survey of large language models in medicine: Progress, application, and challenge. arXiv preprint arXiv: 2311.05112, 2023.
    [145] Kang B, Kim J, Yun T R, et al. Prompt-RAG: Pioneering vector embedding-free retrieval-augmented generation in niche domains, exemplified by Korean medicine. arXiv preprint arXiv: 2401.11246, 2024.
    [146] Quidwai M A, Lagana A. A RAG chatbot for precision medicine of multiple myeloma. medRxiv preprint, 2024: 2024.03.14.24304293.
    [147] Kim J, Min M. From RAG to QA-RAG: Integrating generative AI for pharmaceutical regulatory compliance process. arXiv preprint arXiv: 2402.01717, 2024.
    [148] Rafat M I. AI-powered legal virtual assistant: Utilizing RAG-optimized LLM for housing dispute resolution in Finland. 2024.
    [149] Li Y, Wang S, Ding H, et al. Large language models in finance: A survey. Proceedings of the Fourth ACM International Conference on AI in Finance, 2023: 374−382.
    [150] Ryu C, Lee S, Pang S, et al. Retrieval-based evaluation for LLMs: A case study in Korean legal QA. Proceedings of the Natural Legal Language Processing Workshop 2023, 2023: 132−137.
    [151] Cui C, Ma Y, Cao X, et al. A survey on multimodal large language models for autonomous driving. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024: 958−979.
    [152] Kim J, Rohrbach A, Darrell T, et al. Textual explanations for self-driving vehicles. Proceedings of the European Conference on Computer Vision (ECCV), 2018: 563−578.
    [153] Papineni K, Roukos S, Ward T, et al. BLEU: A method for automatic evaluation of machine translation. Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, 2002: 311−318.
    [154] Elliott D, Keller F. Image description using visual dependency representations. Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, 2013: 1292−1302.
    [155] Vedantam R, Lawrence Zitnick C, Parikh D. CIDEr: Consensus-based image description evaluation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015: 4566−4575.
    [156] Xu Z, Zhang Y, Xie E, et al. DriveGPT4: Interpretable end-to-end autonomous driving via large language model. arXiv preprint arXiv: 2310.01412, 2023.
    [157] Bornea A L, Ayed F, De Domenico A, et al. Telco-RAG: Navigating the challenges of retrieval-augmented language models for telecommunications. arXiv preprint arXiv: 2404.15939, 2024.
    [158] Gaddala V S. Unleashing the power of generative AI and RAG agents in supply chain management: A futuristic perspective. 2023.
    [159] Gupta A, Shirgaonkar A, Balaguer A L, et al. RAG vs fine-tuning: Pipelines, tradeoffs, and a case study on agriculture. arXiv preprint arXiv: 2401.08406, 2024.
    [160] Wang F Y. Foundation worlds for parallel intelligence: From foundation/infrastructure models to foundation/infrastructure intelligence. Alfred North Whitehead Laureate Lectures, Beijing, 2021.
    [161] Packer C, Fang V, Patil S G, et al. MemGPT: Towards LLMs as operating systems. arXiv preprint arXiv: 2310.08560, 2023.
    [162] Pouplin T, Sun H, Holt S, et al. Retrieval-augmented thought process as sequential decision making. arXiv preprint arXiv: 2402.07812, 2024.
    [163] Geva M, Khashabi D, Segal E, et al. Did Aristotle use a laptop? A question answering benchmark with implicit reasoning strategies. Transactions of the Association for Computational Linguistics, 2021, 9: 346−361 doi: 10.1162/tacl_a_00370
    [164] Stelmakh I, Luan Y, Dhingra B, et al. ASQA: Factoid questions meet long-form answers. arXiv preprint arXiv: 2204.06092, 2022.
    [165] Hayashi H, Budania P, Wang P, et al. WikiAsp: A dataset for multi-domain aspect-based summarization. Transactions of the Association for Computational Linguistics, 2021, 9: 211−225 doi: 10.1162/tacl_a_00362
    [166] Zhong M, Liu Y, Yin D, et al. Towards a unified multi-dimensional evaluator for text generation. arXiv preprint arXiv: 2210.07197, 2022.
    [167] Yu H, Gan A, Zhang K, et al. Evaluation of retrieval-augmented generation: A survey. arXiv preprint arXiv: 2405.07437, 2024.
    [168] Chen H, Xu F, Arora S, et al. Understanding retrieval augmentation for long-form question answering. arXiv preprint arXiv: 2310.12150, 2023.
    [169] Yasunaga M, Aghajanyan A, Shi W, et al. Retrieval-augmented multimodal language modeling. arXiv preprint arXiv: 2211.12561, 2022.
    [170] Nashid N, Sintaha M, Mesbah A. Retrieval-based prompt selection for code-related few-shot learning. 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE), 2023: 2450−2462.
    [171] Yu W. Retrieval-augmented generation across heterogeneous knowledge. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Student Research Workshop, 2022: 52−58.
Publication history
  • Received: 2024-03-29
  • Accepted: 2024-10-05
  • Published online: 2025-04-17
