Abstractive Summarization Based on Sequence to Sequence Models: A Review
Shi Lei1, Ruan Xuanmin2, Wei Ruibin1, Cheng Ying2,3
1. School of Management Science and Engineering, Anhui University of Finance and Economics, Bengbu 233030; 2. School of Information Management, Nanjing University, Nanjing 210023; 3. School of Chinese Language and Literature, Shandong Normal University, Jinan 250014 |
Abstract: Compared with earlier abstractive summarization methods, text summarization based on sequence-to-sequence models comes much closer to the way humans write summaries, and the quality of the generated summaries has also improved significantly, attracting growing attention from the academic community. This paper reviews recent research on abstractive summarization based on sequence-to-sequence models. Following the structure of the model, it surveys this research in terms of encoding, decoding, training, and related aspects, and compares and discusses the work. On this basis, it proposes technical routes and development directions for future research in this field.
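To make the framework underlying the surveyed methods concrete, the sketch below shows a minimal attention-based encoder-decoder summarizer in PyTorch. It is only an illustrative assumption of the general sequence-to-sequence architecture, not the implementation of any specific model reviewed here; all class names, dimensions, and hyperparameters are placeholders.

```python
# A minimal sketch (illustrative only) of the encoder-attention-decoder
# framework that sequence-to-sequence summarizers build on.
import torch
import torch.nn as nn
import torch.nn.functional as F


class Seq2SeqSummarizer(nn.Module):
    """Bidirectional GRU encoder + attentive GRU decoder (Bahdanau-style)."""

    def __init__(self, vocab_size, emb_dim=128, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, bidirectional=True, batch_first=True)
        self.decoder = nn.GRUCell(emb_dim + 2 * hid_dim, 2 * hid_dim)
        self.attn = nn.Linear(4 * hid_dim, 1)          # scores each encoder state against the decoder state
        self.out = nn.Linear(2 * hid_dim, vocab_size)  # projects the decoder state to the output vocabulary

    def forward(self, src_ids, tgt_ids):
        # Encode the source article into a sequence of hidden states: (B, S, 2H).
        enc_states, _ = self.encoder(self.embed(src_ids))
        state = enc_states.mean(dim=1)                  # crude initial decoder state: (B, 2H)
        logits = []
        for t in range(tgt_ids.size(1)):                # teacher forcing over the reference summary
            # Attention: weight encoder states by their relevance to the current decoder state.
            scores = self.attn(torch.cat(
                [enc_states, state.unsqueeze(1).expand_as(enc_states)], dim=-1)).squeeze(-1)
            context = (F.softmax(scores, dim=-1).unsqueeze(-1) * enc_states).sum(dim=1)
            # One decoding step conditioned on the previous target token and the context vector.
            state = self.decoder(torch.cat([self.embed(tgt_ids[:, t]), context], dim=-1), state)
            logits.append(self.out(state))
        return torch.stack(logits, dim=1)               # (B, T, vocab), for token-level cross-entropy
```

Training such a baseline minimizes token-level cross-entropy between the returned logits and the reference summary; the encoding, decoding, and training refinements surveyed in this review extend or replace individual parts of this basic pipeline.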
Received: 18 January 2019