Semantic- and Contextual-based Author Bibliographic Coupling Analysis
Zhang Ruhao 1,2, Yuan Junpeng 1,2
1. Department of Library, Information and Archives Management, School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100049
2. National Science Library, Chinese Academy of Sciences, Beijing 100190
Abstract: Author bibliographic coupling analysis (ABCA) is an important tool for detecting active research communities and mapping the knowledge structure of a domain. However, ABCA represents the coupling strength between authors only by citation counts, ignoring similarities between them at a deeper level. To improve the reliability and insight of ABCA, this study mines the key information contained in citation content from full-text resources and proposes a method named semantic- and contextual-based author bibliographic coupling analysis (SC-ABCA), which aims to capture the similarity of citing motives at a more essential level. By mining the full text of scientific literature, the method extracts semantic and contextual features of citations, uses them to calculate an enhanced bibliographic coupling strength, and assigns a different similarity value to each bibliographic couple. Through a “paper-topic-author” aggregation mapping, which accounts for each author's interest in multiple current topics, the method reveals the topic similarity relationships between authors. An empirical study using 13,562 full-text papers in the field of Chinese library and information science (LIS) shows that the proposed method outperforms ABCA in author community discovery and maps a more detailed knowledge structure. It yields a higher content coherence gain and a higher probability of mutual citation within clusters; it is also more robust on large-scale author data and has potential for further extension and application.
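The contrast between count-based coupling and a per-couple similarity weight can be illustrated with a minimal sketch. This is not the SC-ABCA implementation: the function names, the min-count scheme used for the baseline, and the use of cosine similarity over hypothetical citation-context vectors are all assumptions made for illustration only.

```python
from collections import Counter
from math import sqrt


def count_based_strength(refs_a, refs_b):
    """Baseline (assumed scheme): sum of the smaller citation count
    over every reference cited by both authors."""
    count_a, count_b = Counter(refs_a), Counter(refs_b)
    shared = count_a.keys() & count_b.keys()
    return sum(min(count_a[r], count_b[r]) for r in shared)


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = sqrt(sum(x * x for x in u))
    norm_v = sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def weighted_strength(cites_a, cites_b):
    """Hypothetical weighted variant: every bibliographic couple (two
    citations of the same reference, one by each author) contributes the
    cosine similarity of its two citation-context vectors, not a flat 1."""
    total = 0.0
    for ref_a, vec_a in cites_a:
        for ref_b, vec_b in cites_b:
            if ref_a == ref_b:
                total += cosine(vec_a, vec_b)
    return total


# Toy data: reference IDs and 2-d context vectors are invented for illustration.
author_a = [("r1", [1.0, 0.0]), ("r2", [0.0, 1.0])]
author_b = [("r1", [1.0, 0.0]), ("r2", [1.0, 0.0])]
baseline = count_based_strength([r for r, _ in author_a], [r for r, _ in author_b])
weighted = weighted_strength(author_a, author_b)
```

In this toy case both shared references count equally under the baseline, while the weighted variant credits only the couple whose citation contexts point in the same direction.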
Received: 05 September 2021