An Approach to Identifying Transformative Research by Integrating Citation Function and Triangular Citation Structure
Zheng Zhejun1,2, Ma Yaxue1,2, Liang Zhentao3, Bai Yun3, Pei Lei1,2
1. Laboratory of Data Intelligence and Interdisciplinary Innovation, Nanjing University, Nanjing 210023; 2. School of Information Management, Nanjing University, Nanjing 210023; 3. School of Information Management, Wuhan University, Wuhan 430072
Abstract Transformative research (TR) is a precursor to the emergence of new paradigms or disciplines in science and technology, and identifying it is important for R&D management and technological forecasting. Existing approaches rarely consider how citations with different functions affect the evaluation of a focal paper. To address this gap, this study proposes an approach to identifying TR that integrates citation function with the triangular citation structure. We obtained the triangular citation structure among a focal paper, its predecessors, and its successors, and used different combinations of citation functions to extract consolidation or disruption relationships between the papers. An egocentric consolidation-disruption citation network (ECCD) was then constructed for each focal paper. The ECCD network structure and text were used as input to a heterogeneous graph attention neural network model, which identified TR, understood here as papers that combine high academic impact with disruptiveness confirmed by peer review. An empirical analysis on the PubMed Central Open Access Subset showed that the model achieved a best F1-score of 0.3926 on the TR identification task, exceeding the baseline models. Further parameter analysis showed that disruptive citations, as interpreted through citation function, played a critical role in identifying high-impact and transformative research.
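The triangle extraction described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the citation-function labels (`background`, `method`, `result`), the rule that maps function combinations to consolidation or disruption, and all function names are assumptions introduced for the example.

```python
# Citations are stored as {(citing, cited): citation_function}.
# The function labels and the labeling rule below are hypothetical.

SUBSTANTIVE = frozenset({"method", "result"})  # assumed "substantive" functions


def triangles(focal, citations):
    """Yield (predecessor, successor) pairs that close a triangle with `focal`.

    A triangle exists when the focal paper cites the predecessor and the
    successor cites both the focal paper and that predecessor.
    """
    predecessors = {cited for (citing, cited) in citations if citing == focal}
    successors = {citing for (citing, cited) in citations if cited == focal}
    for s in sorted(successors):
        for p in sorted(predecessors):
            if (s, p) in citations:
                yield p, s


def label_triangle(focal, p, s, citations):
    """Label one triangle using the functions of the successor's two citations.

    Assumed rule: substantive use of both papers consolidates the predecessor;
    substantive use of the focal paper alone suggests disruption.
    """
    uses_focal = citations[(s, focal)] in SUBSTANTIVE
    uses_pred = citations[(s, p)] in SUBSTANTIVE
    if uses_focal and uses_pred:
        return "consolidation"
    if uses_focal:
        return "disruption"
    return "neutral"


def ego_labels(focal, citations):
    """Return {(predecessor, successor): label} for every triangle around `focal`."""
    return {
        (p, s): label_triangle(focal, p, s, citations)
        for p, s in triangles(focal, citations)
    }


# Toy data: F is the focal paper, P* are predecessors, S* are successors.
toy_citations = {
    ("F", "P1"): "method",
    ("F", "P2"): "background",
    ("S1", "F"): "method",
    ("S1", "P1"): "method",
    ("S2", "F"): "result",
    ("S2", "P2"): "background",
    ("S3", "F"): "background",  # cites only the focal paper, so no triangle
}

labels = ego_labels("F", toy_citations)
```

In this toy data, `S1` builds on both `F` and `P1`, while `S2` uses `F` substantively and mentions `P2` only as background. The labeled pairs would then serve as typed edges of an egocentric network around `F`; how the study actually encodes them for the graph attention model is not specified in the abstract.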
Received: 23 October 2024