Essential Reference Measurements from the Perspective of Full-Text: Concept Definition, Index System, and Identification Model
Lin Gege¹, Hou Haiyan¹, Pan Yuxin¹, Liang Guoqiang², Hu Zhigang³
1. School of Public Administration and Policy, Dalian University of Technology, Dalian 116024
2. College of Economics and Management, Beijing University of Technology, Beijing 100124
3. Institute for Science, Technology and Society, South China Normal University, Guangzhou 510006
Abstract Identifying the essential references in a citing document is fundamental to thorough evaluation of scientific achievements. This study therefore examines the measurement of essential references from a full-text perspective, covering concept definition, construction of an indicator system, and optimization of identification models, with the aim of providing a more precise tool for scientific evaluation. First, the concept of an essential reference was defined, and an indicator system for identifying essential references was constructed, comprising two dimensions (bibliographic information and citation information), eight sub-dimensions, and 33 citation feature indicators. Second, the citation feature indicators were selected and optimized using several machine learning models, including random forest, support vector machine, and logistic regression. Their correlations and information gains were analyzed, 21 important indicators were retained, and the effectiveness of the identification models was then validated. The results indicate that indicators based on citation information are more important and contribute more to the identification of essential references. The machine learning models performed well: the random forest, support vector machine, and logistic regression models each achieved an area under the receiver operating characteristic curve (AUC) above 0.85, indicating that the models are effective and robust. The essential reference measurement method and identification models provide more accurate tools for scientific evaluation and lay a foundation for further research on citation analysis.
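The two quantities the abstract relies on, information gain for screening indicators and AUC for judging a model, can be illustrated with a minimal sketch. It is not the authors' implementation: the indicator names (`cited_multiple_times`, `same_journal`), the labels, and the scores below are made-up toy values, and the indicators are assumed to be discrete, which the abstract does not state.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature, labels):
    """Reduction in label entropy after grouping samples by a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (assumption): 1 = essential reference, 0 = non-essential reference.
labels = [1, 1, 1, 0, 0, 0, 0, 1]

# Hypothetical binary indicators; the names are illustrative, not the paper's.
indicators = {
    "cited_multiple_times": [1, 1, 1, 0, 0, 0, 0, 1],  # mirrors the labels exactly
    "same_journal": [1, 0, 1, 0, 1, 0, 1, 0],          # split evenly in both groups
}

# Rank indicators by information gain, highest first.
ranked = sorted(
    indicators,
    key=lambda name: information_gain(indicators[name], labels),
    reverse=True,
)

# Hypothetical scores from some classifier, one per sample.
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]

if __name__ == "__main__":
    for name in ranked:
        print(name, round(information_gain(indicators[name], labels), 3))
    print("AUC:", auc(scores, labels))
```

An indicator with an information gain near zero, like the second toy indicator here, tells a classifier almost nothing about the label and would be a candidate for removal during screening. The pairwise AUC above is quadratic in the number of samples, which is acceptable for illustration but not for a large corpus.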
Received: 22 December 2023
