Using Unsupervised Graphs of Neural Networks for Constructing Learning Representations of Academic Papers
Ding Heng, Ren Weiqiang, Cao Gaohui
School of Information Management, Central China Normal University, Wuhan 430079
Abstract Constructing feature representations of academic papers is a key step in providing scholarly big data services such as academic search, literature classification and organization, and personalized paper recommendation. Existing research shows that graph neural networks can learn effective representations of academic papers, but most of this work has focused on supervised learning, which requires large amounts of high-quality labeled data. Against this background, this study uses four unsupervised graph neural networks to learn representations of academic papers from the Cora, CiteSeer, and DBLP datasets; the learned representations are then applied to two downstream tasks, namely literature classification and paper recommendation. The experimental results show that (1) deep graph infomax performs best for literature classification, while the adversarially regularized graph autoencoder performs better for paper recommendation; and (2) on the Cora dataset, citation networks perform better than co-citation networks and bibliographic coupling networks.
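The abstract compares three network types but does not say how they were constructed. As an illustration only, the sketch below derives co-citation and bibliographic coupling counts from a plain citation list using the standard definitions (two papers are co-cited when a third paper cites both; two papers are coupled when they share a reference). The function names and the toy data are hypothetical and are not taken from the study.

```python
from collections import Counter
from itertools import combinations


def co_citation(citations):
    """For each pair of cited papers, count how many papers cite both."""
    counts = Counter()
    for refs in citations.values():
        # Every pair inside one reference list is co-cited once.
        for a, b in combinations(sorted(refs), 2):
            counts[(a, b)] += 1
    return counts


def bibliographic_coupling(citations):
    """For each pair of citing papers, count the references they share."""
    counts = Counter()
    for a, b in combinations(sorted(citations), 2):
        shared = len(set(citations[a]) & set(citations[b]))
        if shared:
            counts[(a, b)] = shared
    return counts


# Hypothetical toy citation network: paper -> set of papers it cites.
toy_citations = {
    "p1": {"p3", "p4"},
    "p2": {"p3", "p4"},
    "p3": {"p4"},
    "p4": set(),
}

co_cited = co_citation(toy_citations)
coupled = bibliographic_coupling(toy_citations)
```

In this toy network, `p3` and `p4` are co-cited by both `p1` and `p2`, while `p1` and `p2` are coupled through their two shared references. Either set of weighted pairs could serve as the edge list of an undirected graph in place of the directed citation edges.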
Received: 27 March 2021