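Abstract: Local chronicles are a component of Chinese culture, and mining and studying them matters for the national goal of building a culturally strong country. Accurately recognizing the entities they contain, in turn, strongly affects knowledge organization and knowledge graph construction for local chronicles. Existing research on named entity recognition (NER) in local chronicles works mainly from text alone and does not use the images that accompany the text, even though image content can supply additional information for recognizing entities in the text and thereby improve model performance. NER in this domain also suffers from a shortage of annotated corpora. To address both problems, this paper proposes a deep transfer learning method that combines the text and images of local chronicles for multimodal NER. First, a BiLSTM-attention-CRF model incorporating self-attention is pre-trained on the People's Daily corpus, and an adaptive co-attention model is pre-trained on a Chinese Twitter multimodal dataset; network-based deep transfer learning then transfers their weights to the local chronicle multimodal NER model, giving it the ability to extract semantic features from both text and images. Next, a filter gate removes noise from the fused multimodal features. Finally, the fused multimodal features are fed into a conditional random field (CRF) layer for decoding. The proposed model is evaluated empirically on multimodal local chronicle data and compared with relevant baseline models; the experimental results show that it has certain advantages over them.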
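The denoising step named in the abstract can be illustrated with a small sketch. The abstract does not give the gate's equations, so everything below is an assumption made for illustration: the function name `filter_gate`, the scalar gate, the weight vector `w_gate`, and the bias `b_gate` are hypothetical and are not the authors' formulation. The idea shown is only the general one: a sigmoid gate, computed from the text feature and the fused multimodal feature, decides how much of the fused feature is passed on to the decoder.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dot(u, v):
    """Dot product of two equal-length lists of floats."""
    return sum(a * b for a, b in zip(u, v))


def filter_gate(text_feat, fused_feat, w_gate, b_gate=0.0):
    """Hypothetical scalar filter gate (an assumption, not the paper's equations).

    text_feat:  text-only feature vector for one token (list of floats)
    fused_feat: fused text+image feature vector for the same token
    w_gate:     gate weights, length len(text_feat) + len(fused_feat)
    b_gate:     gate bias

    Returns the text feature concatenated with the gated fused feature,
    together with the gate value itself.
    """
    # The gate looks at both features and outputs a value in (0, 1).
    gate = sigmoid(dot(w_gate, text_feat + fused_feat) + b_gate)
    # A gate near 0 suppresses the (possibly noisy) fused feature.
    kept = [gate * m for m in fused_feat]
    return text_feat + kept, gate


# Usage: with zero weights the gate is 0.5, so half of the fused feature is kept.
features, gate = filter_gate([1.0, 2.0], [4.0, -2.0], [0.0, 0.0, 0.0, 0.0])
```

In a trained model the gate weights would be learned, so tokens whose image carries no useful information would receive a gate value close to zero; here the weights are fixed by hand purely to show the mechanics.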
Fan Tao, Wang Hao, Chen Yuetong. Research on Multimodal Named Entity Recognition of Local History Based on Deep Transfer Learning[J]. Journal of the China Society for Scientific and Technical Information (情报学报), 2022, 41(4): 412-423.