Identifying Topic Evolutionary Pathways through Dynamic Semantic Network Analytics
Chen Xiang, Huang Lu, Ni Xingxing, Liu Jiarun, Cao Xiaoli, Wang Changtian
School of Management and Economics, Beijing Institute of Technology, Beijing 100081
Abstract With the rapid development of science and technology, many disciplines are changing at an accelerating pace and cross-fusing more intensely. In this context, researchers face an important problem: how to identify research topics quickly and accurately, track their evolutionary pathways and trends, and thereby understand research frontiers. This paper therefore proposes a method for identifying topic evolutionary pathways based on a dynamic network. First, a dynamic keyword network is constructed by introducing piecewise linear representation and the Word2Vec model. Second, a community discovery algorithm is used to identify the communities in the dynamic network, and the evolutionary relationships among topics are represented by measuring topic similarity between adjacent time intervals. Finally, the topic evolutionary pathways are identified. An empirical analysis is conducted in the field of information science. To validate the method, the results obtained with piecewise linear representation are compared with those obtained with the average time-division method, and the topic identification performance of the proposed method is compared with that of K-means and LDA. The study can thus provide decision support for researchers and strategic decision-makers in planning research activities that advance the field.
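The step of linking topics across adjacent time intervals can be illustrated with a minimal sketch. The abstract does not state which similarity measure or threshold is used, so the Jaccard overlap of keyword sets, the 0.3 threshold, the function names `jaccard` and `link_topics`, and the toy keyword communities below are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Set, Tuple

Topic = Set[str]                      # a topic is modelled as a set of keywords
Node = Tuple[int, int]                # (time-slice index, topic index)


def jaccard(a: Topic, b: Topic) -> float:
    """Share of keywords common to both topics (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def link_topics(slices: List[List[Topic]],
                threshold: float = 0.3) -> List[Tuple[Node, Node, float]]:
    """Connect topics in adjacent time slices whose similarity meets the threshold.

    Jaccard overlap and the default threshold are stand-ins (assumptions);
    any other topic similarity measure could be substituted.
    """
    links = []
    for t in range(len(slices) - 1):
        for i, earlier in enumerate(slices[t]):
            for j, later in enumerate(slices[t + 1]):
                sim = jaccard(earlier, later)
                if sim >= threshold:
                    links.append(((t, i), (t + 1, j), sim))
    return links


# Hypothetical keyword communities for two consecutive time slices.
slices = [
    [{"citation", "bibliometrics", "impact"}, {"retrieval", "query", "ranking"}],
    [{"citation", "altmetrics", "impact"}, {"retrieval", "query", "embedding"}],
]

links = link_topics(slices)
for src, dst, sim in links:
    print(src, "->", dst, round(sim, 2))
```

Chaining such links over successive slices yields candidate pathways; a topic with several successors suggests a split, and one with several predecessors suggests a merge.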
Received: 13 August 2020