An Exploration of the Novelty Measurement Task of Scientific Literature Driven by a Large Language Model
Zhang Lin1,2,3, Li Sijia1,2, Shi Shunshun1,2, Gou Zhenyu1,2, Huang Ying1,2,3
1. School of Information Management, Wuhan University, Wuhan 430072
2. Center for Science, Technology & Education Assessment (CSTEA), Wuhan University, Wuhan 430072
3. Centre for R&D Monitoring (ECOOM) and Department of MSI, KU Leuven, Leuven B-3000
Zhang Lin, Li Sijia, Shi Shunshun, Gou Zhenyu, Huang Ying. An Exploration of the Novelty Measurement Task of Scientific Literature Driven by a Large Language Model [大模型驱动的科技论文新颖性测度探索][J]. Journal of the China Society for Scientific and Technical Information (情报学报), 2025, 44(9): 1099-1113.