Measuring the Novelty of Scientific Papers Using Cross-dimensional Feature Integration
Ma Ming1, Zheng Zhejun2, Mao Jin3,4, Bai Yun3, Li Gang4
1. Research Institute for Data Management & Innovation, Nanjing University, Suzhou 215163; 2. School of Information Management, Nanjing University, Nanjing 210023; 3. School of Information Management, Wuhan University, Wuhan 430072; 4. Center for Studies of Information Resources, Wuhan University, Wuhan 430072
Abstract Accurately assessing the intrinsic novelty of scientific papers is essential for advancing academic research and for evaluating scientific innovation reliably. This study proposes a method for measuring the novelty of scientific papers that integrates cross-dimensional features drawn from their knowledge content and their academic communication structure, with the aim of improving the precision of academic assessment. First, a structured knowledge representation of each paper is built from its combinations of research questions and methods, and a domain-pretrained language model is used to weight these combinations. Second, from the perspectives of knowledge content and academic communication structure, a cross-dimensional composite index is constructed to evaluate novelty, focusing on ex ante features such as originality, complexity, and research popularity. The method is validated through an empirical analysis of a biomedical dataset and through ex post verification of the impact of the papers identified as novel. The empirical results indicate that the method is robust to temporal and environmental factors: it remains effective over long time spans and uncovers the novelty patterns of papers in a specific field. Furthermore, comparisons with single-dimensional methods show that the proposed method captures multidimensional composite features more effectively, avoiding the oversimplification and one-sidedness of single-dimensional measures. The study offers a new perspective and methodology for measuring the novelty of scientific papers and gives researchers a practical tool for identifying and advancing innovative research.
Received: 28 May 2024