Abstractive Summarization Based on Sequence to Sequence Models: A Review
Shi Lei1, Ruan Xuanmin2, Wei Ruibin1, Cheng Ying2,3
1. School of Management Science and Engineering, Anhui University of Finance and Economics, Bengbu 233030; 2. School of Information Management, Nanjing University, Nanjing 210023; 3. School of Chinese Language and Literature, Shandong Normal University, Jinan 250014