Identifying “Sleeping Beauties” in Cell Biology and Exploring Their Classical Applications through an Improved BP Network and Function Fitting
Hu Zewen, Jin Xinyue, Cui Jingjing
School of Management Science and Engineering, Nanjing University of Information Science & Technology, Nanjing 210044
Abstract Identifying “sleeping beauties” among the large body of published studies and recommending them to the scientific community allows their scientific and technological value to be fully used, thereby driving the development of science and technology. In this study, we designed and implemented an improved back propagation (BP) neural network model by merging the K-value algorithm, the quadratic function fitting method, the least squares method, and an iterative algorithm. We then used these methods to identify “sleeping beauties” among 401,130 cell biology papers published from 1990 to 2010 and explored the classical applications of the identified papers. The results show that: (1) The BP neural network can increase the degree of automation in identifying “sleeping beauties”; however, some “sleeping beauties” must be identified in advance to form the training set for the recognition model. The improved bivariate quadratic function fitting method and the Gini coefficient, which are based on the least squares method, an iterative algorithm, and a slicing algorithm, are the fastest at identifying “sleeping beauties.” (2) The recognition performance of the bivariate quadratic function fitting method is not affected by the length of the citation period, whereas that of the Gini coefficient is: the number of “sleeping beauties” identified among papers with a shorter citation period (published between 2001 and 2010) is 15 times the number identified among papers with a longer citation period (published between 1990 and 2000). (3) Within the same field, different methods identify different numbers of “sleeping beauties.” For example, among 257,562 papers, the K-value algorithm, the BP neural network model, and the quadratic function fitting method, which have the best recognition performance, identify between 30 and 223 “sleeping beauties,” a share of less than 0.09%. The Gini coefficient, which has poorer recognition performance and is influenced by the length of the citation period, identifies at most 1,066 “sleeping beauties,” raising the share to 0.41%. (4) The annual share of identified “sleeping beauties” remains stable, between 0.02% and 0.17%. (5) The mechanism for identifying “sleeping beauties” can be widely applied to comparing and analyzing bibliometric features across different types of literature and to recognizing and recommending research hotspots, so that the value of the content of the excellent papers identified as “sleeping beauties” can be realized.
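As a rough illustration of two of the indicators named in the abstract, the sketch below fits a quadratic to a paper's annual citation counts by least squares and computes a Gini coefficient over the same counts. It is a minimal sketch under stated assumptions and not the authors' implementation: the function names, the single-variable form of the quadratic, the use of the standard textbook Gini formula, and the sample citation series are all hypothetical, and the abstract does not give the decision thresholds used in the study.

```python
import numpy as np


def quadratic_fit(annual_citations):
    """Least-squares fit of c(t) = a*t**2 + b*t + c0 to annual citation counts.

    Assumption: t is the number of years since publication (1, 2, 3, ...).
    A clearly positive `a` means the curve bends upward, i.e. citations
    arrive late rather than early.
    """
    y = np.asarray(annual_citations, dtype=float)
    t = np.arange(1, y.size + 1, dtype=float)
    a, b, c0 = np.polyfit(t, y, deg=2)  # coefficients, highest degree first
    return float(a), float(b), float(c0)


def gini(annual_citations):
    """Standard Gini coefficient of the annual citation counts.

    0 means citations are spread evenly over the years; values near 1 mean
    they are concentrated in a few years.
    """
    x = np.sort(np.asarray(annual_citations, dtype=float))
    n = x.size
    total = x.sum()
    if total == 0:
        return 0.0  # a never-cited paper has no concentration to measure
    ranks = np.arange(1, n + 1)
    return float(2.0 * (ranks * x).sum() / (n * total) - (n + 1) / n)


# Hypothetical citation histories (illustrative data only).
delayed = [0, 0, 1, 0, 1, 0, 0, 2, 5, 12, 30, 55]  # long sleep, late burst
steady = [5] * 12                                   # evenly cited

a_delayed, _, _ = quadratic_fit(delayed)
a_steady, _, _ = quadratic_fit(steady)
print("quadratic coefficient:", a_delayed, a_steady)
print("Gini coefficient:", gini(delayed), gini(steady))
```

In this toy comparison the delayed series yields a positive quadratic coefficient and a high Gini value, while the evenly cited series yields values near zero for both. Turning either indicator into a classification rule would still require thresholds and a citation window, which this sketch deliberately leaves open.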
Received: 25 April 2022