Research on Multimodal Named Entity Recognition of Local History Based on Deep Transfer Learning
Fan Tao, Wang Hao, Chen Yuetong
School of Information Management, Nanjing University, Nanjing 210023
Abstract: Local history is an integral part of Chinese culture and plays an important role in China's cultural development, which makes it a meaningful object of study. Entity recognition strongly influences knowledge organization and the construction of knowledge graphs. Existing research on named entity recognition (NER) for local history relies on text alone and does not combine text with images, even though images can provide additional information for recognizing entities and thereby improve the performance of an NER model. In addition, annotated corpora for training such models remain scarce. In this study, we therefore proposed a multimodal NER model based on deep transfer learning that combines text and images. First, we pretrained a BiLSTM-attention-CRF model and an adaptive co-attention model on the Renmin Daily corpus and Chinese Twitter multimodal data. Then, we used a method based on deep neural networks to transfer the weights of the pretrained models to the proposed model, enabling it to capture semantic features in both text and images. A filter gate was applied to filter the information carried by the multimodal features. Finally, a CRF layer decoded the fused multimodal features and output the labels. The proposed model was evaluated on a local history multimodal dataset and compared with baseline methods. The experimental results showed that the proposed model outperformed the baselines.
Received: 18 February 2021
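
The three mechanisms named in the abstract (weight transfer from pretrained models, a filter gate over multimodal features, and CRF decoding) can be illustrated with a minimal pure-Python sketch. The abstract gives no formulas, so everything below is an assumption made for illustration only: the function names `transfer_weights`, `filter_gate`, and `viterbi_decode`, the dictionary layout of the parameters, the sigmoid form of the gate, and the use of Viterbi search for decoding are hypothetical and are not taken from the paper.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def transfer_weights(pretrained, target, layer_names):
    """Copy the named layers from a pretrained parameter dict into a copy
    of the target parameter dict (hypothetical layout: name -> weights)."""
    merged = dict(target)
    for name in layer_names:
        merged[name] = pretrained[name]
    return merged


def filter_gate(text_vec, fused_vec, gate_w, gate_b):
    """Scale each fused multimodal feature by a sigmoid gate in (0, 1).

    Assumed form: the gate is computed from the concatenation of the text
    features and the fused features; gate_w holds one weight row per
    fused feature.
    """
    joint = text_vec + fused_vec  # list concatenation
    gated = []
    for row, bias, value in zip(gate_w, gate_b, fused_vec):
        g = sigmoid(sum(w * x for w, x in zip(row, joint)) + bias)
        gated.append(g * value)
    return gated


def viterbi_decode(emissions, transitions):
    """Return the highest-scoring label path of a linear-chain CRF.

    emissions[t][j] is the score of label j at position t;
    transitions[i][j] is the score of moving from label i to label j.
    """
    n_labels = len(transitions)
    scores = list(emissions[0])
    backpointers = []
    for emit in emissions[1:]:
        new_scores, pointers = [], []
        for j in range(n_labels):
            # Best previous label for reaching label j at this position.
            best_i = max(range(n_labels),
                         key=lambda i: scores[i] + transitions[i][j])
            pointers.append(best_i)
            new_scores.append(scores[best_i] + transitions[best_i][j] + emit[j])
        scores = new_scores
        backpointers.append(pointers)
    # Follow the backpointers from the best final label.
    path = [max(range(n_labels), key=lambda j: scores[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path
```

In this sketch the gate only rescales the fused features before they are scored, so a gate value near zero suppresses image-derived information for a given token. A real model would learn `gate_w`, `gate_b`, and the transition scores during training rather than receive them as fixed inputs.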