Study on Identification and Classification of Emergency News Based on the Combined Deep Learning Model
Song Yinghua1,2, Lyu Long1,2, Liu Dan1,2
1. China Research Center for Emergency Management, Wuhan University of Technology, Wuhan 430070
2. School of Safety Science and Emergency Management, Wuhan University of Technology, Wuhan 430070
Abstract Keywords in emergency news differ from those in general news text, and existing deep-learning approaches to news text classification consider only one kind of relationship, either between words or between words and categories. We therefore propose a news text classification model based on a dual-input combined deep learning architecture. First, word vectors represent the relationships between words, and discrete vectors represent the relationships between words and categories. Second, to exploit the strength of the CNN (convolutional neural network) in learning local spatial features, of the LSTM (long short-term memory) network in learning sequential features, and of the MLP (multilayer perceptron) in learning the relationships between words and categories, we construct the combined DCLSTM-MLP deep learning model. Finally, the model is fed both kinds of relationship for 5,477 emergency news texts and 2,815 general news texts, and its performance is analyzed through comparative experiments. The results show that the accuracy, recall rate, and composite value of the first-level emergency identification model all reach 99.55 percent. The accuracy of the second-level emergency classification model reaches 94.82 percent, and its composite value of accuracy and recall rate is 6.06, 2.36, 2.47, 1.14, and 1.79 percent higher than that of the MLP, Text-CNN, Text-LSTM, CNN-MLP, and CLSTM models, respectively. The combined model thus performs news text classification more accurately.
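The abstract does not define the composite (comprehensive) value of accuracy and recall rate. A minimal sketch follows, assuming it denotes the F1 score, that is, the harmonic mean of the two, with the reported "accuracy" read as precision. The function name `composite_value` is hypothetical and does not come from the paper.

```python
def composite_value(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the F1 score).

    Assumption: this is what the abstract calls the composite value;
    the paper's own definition is not given in the abstract.
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# First-level identification model: both inputs are reported as 99.55 percent.
first_level = composite_value(0.9955, 0.9955)
print(f"{first_level:.4f}")  # prints 0.9955
```

Under this assumption, equal precision and recall give a composite value equal to both, which is consistent with all three first-level figures being reported as 99.55 percent.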
Received: 25 November 2019