Multi-dimensional Analysis of Domain Knowledge Based on a Fine-grained Keyword Citation Network
Wang Jiamin1, Lu Wei2,3, Cheng Qikai2,3, Qin Chunxiu1
1. School of Economics and Management, Xidian University, Xi'an 710126
2. School of Information Management, Wuhan University, Wuhan 430072
3. Information Retrieval and Knowledge Mining Laboratory, Wuhan University, Wuhan 430072
Abstract Nodes in existing keyword citation networks carry no semantic role, and the links between them represent only a single type of association. To address this, this study uses the semantic functions of academic text to enrich the semantic information of nodes and of their associations, and proposes a fine-grained keyword citation network. First, the content of academic texts is analyzed to extract the keywords, citation associations, citation contexts, and citation objects of academic literature and to identify their term functions and citation functions. Next, complex network methods are used to construct the fine-grained keyword citation network. Multi-dimensional analysis of domain knowledge is then carried out from three aspects: citation function-aware subnet analysis, multi-dimensional association analysis of specific nodes, and fine-grained analysis of domain knowledge evolution. A dataset from the ACL domain serves as the example for empirical research. The results verify the validity of the proposed method, reveal patterns of use, extension, and comparison among domain knowledge, show how specific research problems develop and how specific methods are applied, and describe the fine-grained evolution of domain knowledge. This study extends the methods and depth of keyword network research and provides a new perspective and path for the multi-dimensional analysis of domain knowledge.
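The network described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes each citation has already been reduced to a set of citing keywords, a set of cited keywords, and one citation function label. The class and method names, the term function labels "problem" and "method", and the toy data are all hypothetical; only the citation function names (use, extension, comparison) come from the abstract.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Keyword:
    """A network node: a keyword plus its term function (assumed labels)."""
    term: str
    function: str  # hypothetical labels, e.g. "problem" or "method"


class KeywordCitationNetwork:
    """Directed keyword network whose edges are typed by citation function."""

    def __init__(self):
        # (citing keyword, cited keyword, citation function) -> edge weight
        self.edges = defaultdict(int)

    def add_citation(self, citing_keywords, cited_keywords, citation_function):
        """Link every citing keyword to every cited keyword for one citation."""
        for src in citing_keywords:
            for dst in cited_keywords:
                if src != dst:
                    self.edges[(src, dst, citation_function)] += 1

    def subnet(self, citation_function):
        """Citation function-aware subnet: keep edges of one function only."""
        return {
            (src, dst): weight
            for (src, dst, func), weight in self.edges.items()
            if func == citation_function
        }

    def links_by_function(self, keyword):
        """Multi-dimensional view of one node: outgoing links per function."""
        grouped = defaultdict(list)
        for (src, dst, func), weight in self.edges.items():
            if src == keyword:
                grouped[func].append((dst, weight))
        return dict(grouped)


# Toy data, invented purely for illustration.
mt = Keyword("machine translation", "problem")
nmt = Keyword("neural machine translation", "method")
bpe = Keyword("subword units", "method")
smt = Keyword("statistical machine translation", "method")

network = KeywordCitationNetwork()
network.add_citation([nmt, mt], [bpe], "use")
network.add_citation([nmt], [smt], "comparison")
network.add_citation([nmt], [bpe], "use")

use_subnet = network.subnet("use")
nmt_links = network.links_by_function(nmt)
```

Keeping the citation function in the edge key lets one structure serve both the subnet view (filter by function) and the per-node view (group by function). Evolution analysis would additionally need a publication year on each citation, which this sketch omits.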
Received: 16 August 2021