Deep Learning-Driven Multilabel Classification and Semantic Relationship Mapping of Disaster Entities in Chinese Meteorological Historical Materials
Hu Zewen, Xie Shaoke
School of Management Science and Engineering, Nanjing University of Information Science & Technology, Nanjing 210044
Abstract: This study addresses the automatic identification and multilabel classification of disaster entities in Chinese meteorological historical materials, together with the construction of a knowledge graph of their semantic relationships and the analysis of its cyclical dynamic evolution. Meteorological historical records from the early Ming Dynasty, drawn from the Chinese Three Thousand Years of Meteorological Records Collection, served as the study sample. A BERT-RCNN model (bidirectional encoder representations from transformers combined with recurrent convolutional neural networks) was applied to identify disaster entities and to perform multilabel classification of the disaster records. The resulting structured data were used to construct the schema layer of the semantic relationship mapping of historical meteorological disaster entities. The Neo4j graph database and the Gephi visualization tool were then used together to visualize the mapping and analyze its cyclical evolution, revealing the temporal changes and spatial distribution of meteorological disasters in the early Ming Dynasty. The results showed that the BERT-RCNN model had clear performance advantages in the automatic identification and multilabel classification of meteorological disaster entities in these materials, with mean micro-F1, classification precision, and disaster-record recall reaching up to 96%. Performance differed across disaster entities: the model performed best on floods and droughts, which occurred more frequently in the early Ming Dynasty, and worse on a small number of disaster types with lower occurrence frequencies.
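The micro-averaged metrics reported above pool true positives, false positives, and false negatives over all records and labels before computing precision, recall, and F1. The following is a minimal sketch of that computation under stated assumptions: the function name `micro_prf` and the toy label sets are hypothetical illustrations, not the authors' evaluation code or data.

```python
from typing import List, Set, Tuple


def micro_prf(gold: List[Set[str]], pred: List[Set[str]]) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall, and F1 for multilabel records.

    Each record is a set of labels; counts are pooled over all records
    before the ratios are taken.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must contain the same number of records")
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)   # labels predicted and correct
        fp += len(p - g)   # labels predicted but not in the gold set
        fn += len(g - p)   # gold labels the model missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical toy records (invented for illustration only).
gold_labels = [{"flood"}, {"drought", "locust"}, {"flood", "hail"}]
pred_labels = [{"flood"}, {"drought"}, {"flood", "hail", "drought"}]

precision, recall, f1 = micro_prf(gold_labels, pred_labels)
print(f"precision={precision:.2f} recall={recall:.2f} micro-F1={f1:.2f}")
```

Because counts are pooled, frequent labels such as floods and droughts dominate the micro average, which is one reason per-type performance can differ from the headline figure.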
The semantic relationship mapping of historical meteorological disaster entities, built from the structured data produced by multilabel classification, clearly revealed how different types of disaster entities varied over time and space. Temporally, the frequency of meteorological disasters in early Ming Dynasty China first decreased slightly, then increased gradually, and finally rose sharply in the last stage. The main disaster types affecting societal development in this period were floods, droughts, and locust plagues. Spatially, droughts and floods were more frequent and more concentrated in the middle and lower reaches of the Yangtze River and the lower reaches of the Yellow River.
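As a rough illustration of how classified records might feed a relationship graph and a temporal frequency analysis, the sketch below turns structured records into subject-relation-object triples and counts disaster types per time window. Everything named here is an assumption for illustration: the `DisasterRecord` fields, the relation names (`OCCURRED_IN_YEAR`, `OCCURRED_AT`, `HAS_DISASTER_TYPE`), the ten-year window, and the toy records are hypothetical and are not the schema or data used in the study. The triples are plain tuples; loading them into a graph database is left out.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class DisasterRecord:
    """One classified historical record (hypothetical structure)."""
    record_id: int
    year: int
    place: str
    labels: Tuple[str, ...]  # disaster types assigned by the classifier


def record_to_triples(rec: DisasterRecord) -> List[Triple]:
    """Express one record as (subject, relation, object) triples."""
    event = f"event_{rec.record_id}"
    triples: List[Triple] = [
        (event, "OCCURRED_IN_YEAR", str(rec.year)),
        (event, "OCCURRED_AT", rec.place),
    ]
    triples.extend((event, "HAS_DISASTER_TYPE", label) for label in rec.labels)
    return triples


def frequency_by_period(records: List[DisasterRecord], span: int = 10) -> Dict[Tuple[int, str], int]:
    """Count disaster labels per time window of `span` years."""
    counts: Counter = Counter()
    for rec in records:
        window_start = rec.year // span * span
        for label in rec.labels:
            counts[(window_start, label)] += 1
    return dict(counts)


# Hypothetical toy records with placeholder place names (not historical data).
records = [
    DisasterRecord(1, 1370, "place_A", ("flood",)),
    DisasterRecord(2, 1374, "place_B", ("drought", "locust")),
    DisasterRecord(3, 1381, "place_A", ("flood",)),
]

triples = [t for rec in records for t in record_to_triples(rec)]
period_counts = frequency_by_period(records)
print(len(triples), period_counts)
```

Keeping each record as an event node with separate time, place, and type relations lets the same triples support both temporal counts and spatial grouping.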
Received: 31 May 2025
(Responsible editor: Feng Jiaqi)