The Novelty Evaluation of Articles on WeChat Subscription Based on Recursive Neural Tensor Network
Wang Ping1, Hou Jingrui2, Wu Renli2
1. Center for Studies of Information Resources, Wuhan University, Wuhan 430072
2. School of Information Management, Wuhan University, Wuhan 430072
Abstract: Content homogeneity on We-Media platforms is becoming increasingly serious, making it difficult for users to obtain high-quality information. Evaluating the novelty of We-Media articles is therefore particularly important. Taking articles from WeChat Subscription as an example, this paper proposes a novelty evaluation method for articles on We-Media platforms. The method uses an unsupervised, sentence-level Doc2Vec language model to construct text vectors and builds a novelty evaluation model based on the recursive neural tensor network to quantify each article's novelty. For the empirical study, 4,628 articles were collected automatically from WeChat Subscription. First, several tensor slice settings were compared in contrastive experiments, and the optimal parameters were chosen by weighing the characteristics of the resulting novelty distribution against training time. Subsequently, a linear regression relationship between novelty and similarity was found and then verified by calculating the similarity of the documents. The experimental results demonstrate the feasibility and effectiveness of the approach. This paper extends research on document novelty evaluation from the perspective of deep learning, and it supports novel topic detection and frontier knowledge discovery on We-Media platforms.
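The abstract does not give the scoring formula, so the following is only a minimal sketch of how a neural tensor layer with k slices can score a pair of text vectors, assuming the general form u^T tanh(a^T W[1:k] b + V[a; b] + bias) introduced by Socher et al. (2013). The function name `ntn_score`, the toy parameters, and the two-dimensional vectors are illustrative assumptions, not the authors' implementation, and how the paper maps such a score to a novelty value is not specified here.

```python
import math

def ntn_score(a, b, W, V, bias, u):
    """Score a pair of vectors with a neural tensor layer (illustrative sketch).

    a, b : lists of length d (e.g. Doc2Vec-style text vectors; assumed input)
    W    : k slices, each a d x d matrix (the tensor)
    V    : k rows, each of length 2d (standard linear layer on [a; b])
    bias : list of length k
    u    : list of length k (output weights)
    """
    ab = a + b  # list concatenation, i.e. the stacked vector [a; b]
    score = 0.0
    for k in range(len(u)):
        # Bilinear term a^T W[k] b for slice k
        bilinear = sum(
            a[i] * W[k][i][j] * b[j]
            for i in range(len(a))
            for j in range(len(b))
        )
        # Linear term V[k] . [a; b]
        linear = sum(V[k][m] * ab[m] for m in range(len(ab)))
        score += u[k] * math.tanh(bilinear + linear + bias[k])
    return score

# Toy parameters (hypothetical): d = 2, k = 2, identity slices, no linear term.
identity = [[1.0, 0.0], [0.0, 1.0]]
W = [identity, identity]
V = [[0.0] * 4, [0.0] * 4]
bias = [0.0, 0.0]
u = [1.0, 1.0]

same = ntn_score([1.0, 0.0], [1.0, 0.0], W, V, bias, u)        # identical vectors
orthogonal = ntn_score([1.0, 0.0], [0.0, 1.0], W, V, bias, u)  # unrelated vectors
print(same, orthogonal)
```

With these toy parameters each slice reduces to a dot product, so identical vectors score higher than orthogonal ones. In a trained model the slices are learned, and the number of slices is the kind of setting the contrastive experiments vary.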
Received: 13 August 2018