Research on Automatic Identification and Classification of Actionable Information in Emergencies
Wu Xuehua, Mao Jin, Chen Sijing, Xie Hao, Li Gang
Center for Studies of Information Resources, Wuhan University, Wuhan 430072
Abstract Actionable information enables timely response and flexible mobilization during emergencies. It plays a critical role in supporting rescue, resource allocation, and other response efforts, and in minimizing casualties and damage. This study explores how different kinds of actionable information can be obtained from social media. To this end, we clarify the concept, characteristics, and categories of actionable information and propose a two-stage machine learning framework that first identifies actionable information and then classifies it. The framework uses word vector representations together with linguistic, surface-based, and user-based features. Five machine learning algorithms, namely Support Vector Machine, Logistic Regression, TextCNN, BERT, and a combination of BERT and TextCNN (BERT + TextCNN), were evaluated on a manually annotated dataset, and the performance of different classification strategies, algorithms, and features was compared. The experimental results indicate that the two-stage approach can provide actionable information at various granularities without sacrificing performance. BERT and BERT + TextCNN outperform the other models in both stages. Combining linguistic, surface-based, and user-based features contributes little to the identification task in the first stage but significantly improves performance on the classification task in the second stage. Our research helps to better incorporate social media streams into emergency workflows, which can, to some extent, mitigate information overload during emergency response and improve response efficiency.
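The two-stage idea summarized in the abstract can be illustrated with a minimal sketch: stage one decides whether a message is actionable, and stage two assigns a category only to messages that pass stage one. Everything named here is an assumption made for illustration. The function `two_stage_classify`, the keyword-based stand-ins `toy_identifier` and `toy_categorizer`, the cue words, and the category labels are all hypothetical; they are not the authors' models, features, or category scheme. In practice each stage would be a trained classifier such as those listed in the abstract.

```python
from typing import Callable, List, Optional


def two_stage_classify(
    messages: List[str],
    is_actionable: Callable[[str], bool],
    categorize: Callable[[str], str],
) -> List[Optional[str]]:
    """Stage 1 filters actionable messages; stage 2 labels only those.

    Non-actionable messages receive None instead of a category.
    """
    labels: List[Optional[str]] = []
    for text in messages:
        if is_actionable(text):
            labels.append(categorize(text))
        else:
            labels.append(None)
    return labels


# Hypothetical cue words, used only so the sketch runs without a trained model.
ACTION_CUES = ("need", "send", "donate", "trapped", "help")


def toy_identifier(text: str) -> bool:
    """Stand-in for the stage-one classifier (actionable vs. not)."""
    lowered = text.lower()
    return any(cue in lowered for cue in ACTION_CUES)


def toy_categorizer(text: str) -> str:
    """Stand-in for the stage-two classifier; the labels are illustrative."""
    lowered = text.lower()
    if "trapped" in lowered:
        return "rescue_request"
    if "donate" in lowered or "send" in lowered:
        return "resource_offer"
    return "other_actionable"


if __name__ == "__main__":
    sample = [
        "Family trapped on the roof, need a boat",
        "Thinking of everyone affected",
        "We can donate bottled water",
    ]
    print(two_stage_classify(sample, toy_identifier, toy_categorizer))
```

Because the two stages are passed in as plain callables, either one can be swapped for a different model without changing the routing logic, which is one way to compare algorithms per stage.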
Received: 13 August 2020