A Survey of Deep Learning Methods for Abstractive Text Summarization
Zhao Hong 1,2
1. Department of Information Resources Management, Business School, Nankai University, Tianjin 300071; 2. CETC Big Data Research Institute Co., Ltd., Guiyang 550081
Abstract Abstractive text summarization (ATS) is a major research topic in text mining. Compared with extractive summarization, which captures only the shallow meaning of a text, ATS more closely resembles the way humans summarize, giving it considerable research significance. With the development of deep learning methods and deep neural networks in recent years, remarkable progress has been made in ATS. To provide a more comprehensive picture of the theory and state of ATS research, this paper describes the ATS task together with the five classes of deep learning methods that support it, namely recurrent neural networks (RNN), convolutional neural networks (CNN), combined RNN+CNN architectures, attention-based models, and reinforcement learning models. The surveyed results show that ATS performance can be improved significantly by training deep neural networks, especially after incorporating attention mechanisms and reinforcement learning. For the future development of ATS, beyond the continued application and improvement of deep learning methods themselves, researchers must consider how to effectively realize ATS with text-level semantic comprehension, how to extend ATS to more categories of text, and how to evaluate ATS. Integrating mature traditional research methods to further improve ATS performance is also a valuable direction for future research.
Received: 04 December 2018
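
As background for the architectures named in the abstract, the following is a minimal sketch (not taken from any of the surveyed systems) of the attention-based encoder-decoder structure that most RNN approaches to ATS share: an RNN encoder reads the source document, and at each decoding step an attention distribution over the encoder states produces a context vector that conditions the next summary token. It assumes PyTorch; the vocabulary size, hidden dimensions, and greedy decoding loop are illustrative placeholders, not settings from the cited papers.

```python
# Minimal attention-based seq2seq summarizer sketch (illustrative only; assumes PyTorch).
import torch
import torch.nn as nn
import torch.nn.functional as F

class Seq2SeqAttnSummarizer(nn.Module):
    def __init__(self, vocab_size=10000, emb_dim=128, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True, bidirectional=True)
        self.decoder = nn.GRUCell(emb_dim + 2 * hid_dim, 2 * hid_dim)
        self.attn = nn.Linear(2 * hid_dim, 2 * hid_dim, bias=False)  # general (Luong-style) attention
        self.out = nn.Linear(4 * hid_dim, vocab_size)

    def forward(self, src_ids, max_len=30, bos_id=1):
        # Encode the source document into contextual hidden states.
        enc_out, _ = self.encoder(self.embed(src_ids))           # (B, S, 2H)
        state = enc_out.mean(dim=1)                               # simple decoder-state init, (B, 2H)
        token = torch.full((src_ids.size(0),), bos_id, dtype=torch.long)
        logits = []
        for _ in range(max_len):
            # Attention: score each encoder state against the current decoder state.
            scores = torch.bmm(self.attn(enc_out), state.unsqueeze(2)).squeeze(2)   # (B, S)
            weights = F.softmax(scores, dim=1)
            context = torch.bmm(weights.unsqueeze(1), enc_out).squeeze(1)           # (B, 2H)
            # Decode one summary token conditioned on the attended context.
            state = self.decoder(torch.cat([self.embed(token), context], dim=1), state)
            step_logits = self.out(torch.cat([state, context], dim=1))
            logits.append(step_logits)
            token = step_logits.argmax(dim=1)                     # greedy decoding for illustration
        return torch.stack(logits, dim=1)                         # (B, max_len, vocab)

# Toy usage: two 20-token "documents" drawn at random.
model = Seq2SeqAttnSummarizer()
summary_logits = model(torch.randint(0, 10000, (2, 20)))
print(summary_logits.shape)  # torch.Size([2, 30, 10000])
```

The models reviewed in the survey build on this skeleton with mechanisms the sketch omits, such as copy/pointer networks, coverage, beam-search decoding, and reinforcement-learning training objectives.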
|
|
|
|