An Idiomatic Metaphorical Word-Formation Method Based on a Large Language Model and Its Application: Knowledge Reorganization, Backtracking, and Discovery
Zhang Wei1,2, Wang Dongbo1,2, Liu Liu1,2
1. College of Information Management, Nanjing Agricultural University, Nanjing 210095
2. Research Center for Humanities and Social Computing, Nanjing Agricultural University, Nanjing 210095
Zhang Wei, Wang Dongbo, Liu Liu. An Idiomatic Metaphorical Word-Formation Method Based on a Large Language Model and Its Application: Knowledge Reorganization, Backtracking, and Discovery[J]. 情报学报 (Journal of the China Society for Scientific and Technical Information), 2025, 44(9): 1083-1098.
(Responsible editor: Wang Keping)