Research Interest Space Mining and Double-Task Research Recommendation Based on Heterogeneous Network Embedding
Cui Hongfei1, Feng Zihan2, Zhang Jingyu3
1. School of Economics and Management, University of Science and Technology Beijing, Beijing 100083
2. School of Economics and Management, Beihang University, Beijing 100191
3. University of Edinburgh Business School, the University of Edinburgh, Edinburgh EH8 9JU
Abstract: The Internet hosts rich literature databases that are crucial resources for researchers who want to follow advances in their fields. Efficiently mining the massive body of research output within a specific field can give researchers worldwide clearer direction amid the flow of knowledge. Using more than 130,000 scientific papers spanning 11 years (2010-2021) collected from PubMed, a well-known biomedical literature database, this study investigated the historical behavioral information of researchers. A heterogeneous information network containing authors, papers, and keywords was built, and its nodes were then embedded into a “heterogeneous research interest space” using the metapath2vec heterogeneous network embedding algorithm. Both research collaborator recommendation and keyword recommendation were implemented on the basis of a similarity metric between vectors in this space. Compared with existing studies, the proposed method achieves double-task recommendation: it obtains meaningful results in the traditional research collaborator recommendation task and significantly surpasses existing performance in heterogeneous recommendation between researchers and keywords. Furthermore, in-depth mining of the author and keyword vectors in the space demonstrated that the heterogeneous research interest space can guide the interpretation of the semantic meaning of researchers’ interests across research fields. These characteristics of the research interest space offer new perspectives for mining and comprehensively understanding research interests.
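The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy network, the node identifiers, the meta-path author-paper-keyword-paper-author, the helper names `metapath_walk` and `recommend`, the hand-written two-dimensional vectors, and the choice of cosine similarity are all assumptions made for illustration. The abstract states only that metapath2vec was used and that recommendation relies on a similarity metric between vectors. The skip-gram training step that turns walks into vectors is omitted here.

```python
import math
import random

# Toy heterogeneous network (hypothetical data), stored as typed adjacency lists.
# Node types: "A" = author, "P" = paper, "K" = keyword.
EDGES = {
    ("A", "P"): {"a1": ["p1", "p2"], "a2": ["p1"], "a3": ["p3"]},
    ("P", "A"): {"p1": ["a1", "a2"], "p2": ["a1"], "p3": ["a3"]},
    ("P", "K"): {"p1": ["k1"], "p2": ["k1", "k2"], "p3": ["k2"]},
    ("K", "P"): {"k1": ["p1", "p2"], "k2": ["p2", "p3"]},
}


def metapath_walk(start, metapath, length, rng):
    """Random walk whose node types follow `metapath`, repeated cyclically.

    `metapath` must start and end with the same type, e.g. A-P-K-P-A.
    The walk stops early if the current node has no neighbor of the
    required next type.
    """
    pattern = metapath[:-1]  # drop the repeated last type to get one cycle
    walk = [start]
    step = 0
    while len(walk) < length:
        src_type = pattern[step % len(pattern)]
        dst_type = pattern[(step + 1) % len(pattern)]
        neighbors = EDGES[(src_type, dst_type)].get(walk[-1], [])
        if not neighbors:
            break
        walk.append(rng.choice(neighbors))
        step += 1
    return walk


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def recommend(query, vectors, prefix, k):
    """Return the k nodes whose id starts with `prefix` that are most similar to `query`."""
    scored = [
        (cosine(vectors[query], vec), node)
        for node, vec in vectors.items()
        if node != query and node.startswith(prefix)
    ]
    scored.sort(reverse=True)
    return [node for _, node in scored[:k]]


# Placeholder vectors standing in for trained embeddings (assumed values).
# Authors and keywords live in the same space, so one similarity function
# serves both recommendation tasks.
VECTORS = {
    "a1": [0.9, 0.1], "a2": [0.8, 0.2], "a3": [0.1, 0.9],
    "k1": [1.0, 0.0], "k2": [0.0, 1.0],
}

walk = metapath_walk("a1", ["A", "P", "K", "P", "A"], 9, random.Random(0))
collaborators = recommend("a1", VECTORS, "a", 1)  # collaborator recommendation
keywords = recommend("a1", VECTORS, "k", 1)       # keyword recommendation
```

In a fuller version, many such walks per author would be fed to a skip-gram model to learn the vectors. Because authors and keywords share one embedding space, the same ranking function can cover both tasks, which is the sense in which the recommendation is double-task.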
Received: 02 January 2023