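Abstract: Given the wide use of text matching in information retrieval, text mining, and related fields, this paper proposes a deep interaction text matching (DITM) model with good generalization ability. Built on the matching-aggregation framework, DITM uses an encoding layer, a co-attention layer, and a fusion layer as its interaction module, applies this module repeatedly to obtain deep interaction information, and then extracts information through multi-perspective pooling to predict the relationship between the two texts in a pair. Compared with the baseline methods, DITM achieves the best results on the corresponding datasets for four text matching tasks: opinion retrieval, answer selection, paraphrase identification, and natural language inference. These results are significant for promoting the practical use of text matching models in the field of information science.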
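The pipeline named in the abstract (encoding, co-attention, fusion, repeated interaction, pooling) can be illustrated with a minimal sketch. The abstract does not give the layer definitions, so everything below is an assumption made for illustration and not the authors' implementation: the encoder is a stand-in linear map with tanh, co-attention uses dot-product scores, fusion concatenates the input with its aligned counterpart, their difference, and their product, and multi-perspective pooling is taken to mean max plus mean pooling. All function names, including `ditm_like_forward`, are hypothetical.

```python
import math
import random

def matmul(a, b):
    # Plain-list matrix product: (n, k) x (k, m) -> (n, m).
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def softmax_rows(m):
    # Numerically stable softmax applied to each row.
    out = []
    for row in m:
        mx = max(row)
        exps = [math.exp(v - mx) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def encode(x, w):
    # ASSUMPTION: stand-in encoder (position-wise linear map + tanh).
    return [[math.tanh(v) for v in row] for row in matmul(x, w)]

def co_attention(a, b):
    # ASSUMPTION: dot-product scores; each token attends over the other text.
    scores = matmul(a, transpose(b))                          # (len_a, len_b)
    a_aligned = matmul(softmax_rows(scores), b)               # (len_a, d)
    b_aligned = matmul(softmax_rows(transpose(scores)), a)    # (len_b, d)
    return a_aligned, b_aligned

def fuse(x, aligned, w):
    # ASSUMPTION: concatenate [x; aligned; x - aligned; x * aligned], project to d.
    feats = [
        xi + ai + [p - q for p, q in zip(xi, ai)] + [p * q for p, q in zip(xi, ai)]
        for xi, ai in zip(x, aligned)
    ]
    return [[math.tanh(v) for v in row] for row in matmul(feats, w)]

def pool(x):
    # ASSUMPTION: "multi-perspective" pooling taken as max and mean over tokens.
    cols = list(zip(*x))
    return [max(c) for c in cols] + [sum(c) / len(c) for c in cols]

def ditm_like_forward(a, b, params, n_blocks):
    # Apply the interaction module (encode -> co-attend -> fuse) n_blocks times.
    for k in range(n_blocks):
        enc_w, fuse_w = params[k]
        a, b = encode(a, enc_w), encode(b, enc_w)
        a_al, b_al = co_attention(a, b)
        a, b = fuse(a, a_al, fuse_w), fuse(b, b_al, fuse_w)
    return pool(a) + pool(b)

def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
d = 4  # hypothetical embedding size
params = [(random_matrix(d, d, rng), random_matrix(4 * d, d, rng)) for _ in range(2)]
text_a = random_matrix(3, d, rng)  # 3 tokens, random stand-in embeddings
text_b = random_matrix(5, d, rng)  # 5 tokens
features = ditm_like_forward(text_a, text_b, params, n_blocks=2)
print(len(features))  # 4 * d = 16
```

In a full model, the pooled feature vector would be passed to a classifier that predicts the relationship between the two texts. That step is omitted here because the abstract does not describe it.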
Yu Chuanming, Xue Haodong, Jiang Yifan. Research on Text Matching Model Based on Deep Interaction[J]. Journal of the China Society for Scientific and Technical Information (情报学报), 2021, 40(10): 1015-1026.