生成式自动文摘的深度学习方法综述

doi:10.3772/j.issn.1000-0135.2020.03.010

情报学报

2020, Vol. 39

Issue (3): 330-344 DOI: 10.3772/j.issn.1000-0135.2020.03.010

研究进展与文献综述

本期目录 | 过刊浏览 | 高级检索

生成式自动文摘的深度学习方法综述

赵洪^1,2

1.南开大学商学院信息资源管理系，天津 300071
2.中电科大数据研究院有限公司，贵阳 550081

A Survey of Deep Learning Methods for Abstractive Text Summarization

Zhao Hong^1,2

1.Department of Information Resources Management, Business School, Nankai University, Tianjin 300071
2.CETC Big Data Research Institute Co. Ltd., Guiyang 550081

摘要
图/表
参考文献
相关文章 (15)

全文: PDF (3146 KB) HTML (222 KB)
输出: BibTeX | EndNote (RIS)

摘要自动文摘是文本挖掘的主要任务之一。相比于抽取式自动文摘，生成式自动文摘在思想上更接近人工摘要的过程，具有重要研究意义。近几年伴随着深度学习方法的发展，基于深层神经网络模型的生成式自动文摘也有了令人瞩目的发展。为了更全面地理解该类方法的思想和研究现状，本文从生成式自动文摘的任务描述入手，梳理了基于RNN（recurrent neural network，循环神经网络）的模型、基于CNN（convolutional neural network，卷积神经网络）的模型、基于RNN+CNN的模型、融合注意力机制的模型和融合强化学习的模型共五大类生成式自动文摘的深度学习方法。这类方法表明，在深层神经网络的训练下，特别是融合注意力机制和强化学习后，摘要效果得以明显提升。在生成式自动文摘研究的未来发展中，除深度学习方法本身的不断应用和改进外，还需关注如何有效实现篇章级语义理解下的摘要、面向不同文本对象特点的摘要和摘要结果自动评价等问题。此外，如何结合传统摘要研究中的成熟方法进一步提高摘要效果，也是一个很有价值的研究方向。

	服务

	把本文推荐给朋友
	加入我的书架
	加入引用管理器
	E-mail Alert
	RSS
	作者相关文章
	赵洪

关键词 ：生成式自动文摘, 深度学习, 循环神经网络, 卷积神经网络, 注意力机制, 强化学习

收稿日期: 2018-12-04

基金资助:提升政府治理能力大数据应用技术国家工程实验室2017—2018年度开放基金重点支持项目“基于NLP和深度学习的大规模政府公文智能处理技术研究”；国家社会科学基金重大项目“我国网络社会治理研究”（14ZDA063）。

作者简介: 赵洪，男，1986年生，博士研究生，主要研究方向为机器学习与知识发现，E-mail:zhaohong@mail.nankai.edu.cn。

引用本文:

赵洪. 生成式自动文摘的深度学习方法综述[J]. 情报学报, 2020, 39(3): 330-344.
Zhao Hong. A Survey of Deep Learning Methods for Abstractive Text Summarization. 情报学报, 2020, 39(3): 330-344.

链接本文:

https://qbxb.istic.ac.cn/CN/10.3772/j.issn.1000-0135.2020.03.010 或 https://qbxb.istic.ac.cn/CN/Y2020/V39/I3/330

1 Wikipedia. Automatic summarization[EB/OL]. [2018-11-03]. https://en.wikipedia.org/wiki/Automatic_summarization.
2 ChenH H, KuoJ J, HuangS J, et al. A summarization system for Chinese news from multiple sources[J]. Journal of the American Society for Information Science and Technology, 2003, 54(13):1224-1236.
3 DoumaN. Attention neural network-based abstractive summarization and headline generation[D]. Karlsruher: Karlsruhe Institute of Technology, 2018.
4 郭艳卿, 赵锐, 孔祥维, 等. 基于事件要素加权的新闻摘要提取方法[J]. 计算机科学, 2016, 43(1): 237-241.
5 高永兵, 钟振华, 王宇, 等. 基于混合方法的中文微博自动摘要技术研究[J]. 计算机工程与科学, 2016, 38(6): 1257-1261.
6 DuttaS, ChandraV, MehraK, et al. Ensemble algorithms for microblog summarization[J]. IEEE Intelligent Systems, 2018, 33(3): 4-14.
7 SharifiB P, InouyeD I, KalitaJ K. Summarization of Twitter microblogs[J]. The Computer Journal, 2014, 57(3): 378-402.
8 MaS M, SunX, LinJ Y, et al. Autoencoder as assistant supervisor: Improving text representation for Chinese social media text summarization[C]// Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: Association for Computational Linguistics, 2018: 725-731.
9 刘家益, 李鲡瑶, 张智雄, 等. 关键词和被引次数对科技论文自动摘要效果影响研究[J]. 情报学报, 2017, 36(11): 1165-1174.
10 AllamanisM, PengH, SuttonC. A convolutional attention network for extreme summarization of source code[C]// Proceedings of the 33rd International Conference on Machine Learning. PMLR, 2016, 48: 2091-2100.
11 NikolovN I, PfeifferM, HahnloserR H R. Data-driven summarization of scientific articles[C]// Proceedings of the Eleventh International Conference on Language Resources and Evaluation. Paris: European Language Resources Association, 2018.
12 YelogluO, MiliosE, Zincir-HeywoodN. Multi-document summarization of scientific corpora[C]// Proceedings of the ACM Symposium on Applied Computing. New York: ACM Press, 2011: 252-258.
13 Al SaiedH, DuguéN, LamirelJ C. Automatic summarization of scientific publications using a feature selection approach[J]. International Journal on Digital Libraries, 2018, 19(2-3): 203-215.
14 任鹏杰. 基于有监督深度学习的抽取式多文档自动摘要研究[D]. 济南: 山东大学, 2018.
15 LuhnH P. The automatic creation of literature abstracts[J]. IBM Journal of Research and Development, 1958, 2(2): 159-165.
16 EdmundsonH P, WyllysR E. Automatic abstracting and indexing—Survey and recommendations[J]. Communications of the ACM, 1961, 4(5): 226-234.
17 PollockJ J, ZamoraA. Automatic abstracting research at chemical abstracts service[J]. Journal of Chemical Information and Computer Sciences, 1975, 15(4): 226-232.
18 PaiceC D. The automatic generation of literature abstracts: An approach based on the identification of self-indicating phrases[C]// Proceedings of the 3rd Annual ACM Conference on Research and Development in Information Retrieval. New York: ACM Press, 1980: 172-191.
19 RauL F, JacobsP S, ZernikU. Information extraction and text summarization using linguistic knowledge acquisition[J]. Information Processing & Management, 1989, 25(4): 419-428.
20 SaltonG, SinghalA, MitraM, et al. Automatic text structuring and summarization[J]. Information Processing & Management, 1997, 33(2): 193-207.
21 RadevD R, JingH Y, BudzikowskaM. Centroid-based summarization of multiple documents: Sentence extraction, utility-based evaluation, and user studies[C]// Proceedings of the NAACL- ANLP Workshop on Automatic Summarization. Stroudsburg: Association for Computational Linguistics, 2000: 21-30.
22 AliguliyevR M. A new sentence similarity measure and sentence based extractive technique for automatic text summarization[J]. Expert Systems with Applications, 2009, 36(4): 7764-7772.
23 Yousefi-AzarM, HameyL. Text summarization using unsupervised deep learning[J]. Expert Systems with Applications, 2017, 68: 93-105.
24 MehtaP, MajumderP. Effective aggregation of various summarization techniques[J]. Information Processing & Management, 2018, 54(2): 145-158.
25 ErkanG, RadevD R. LexRank: Graph-based lexical centrality as salience in text summarization[J]. Journal of Artificial Intelligence Research, 2004, 22: 457-479.
26 RadevD R, JingH Y, Sty?M, et al. Centroid-based summarization of multiple documents[J]. Information Processing & Management, 2004, 40(6): 919-938.
27 LinC Y, HovyE. The automated acquisition of topic signatures for text summarization[C]// Proceedings of the 18th Conference on Computational Linguistics. Stroudsburg: Association for Computational Linguistics, 2000: 495-501.
28 HsuW T, LinC K, LeeM Y, et al. A unified model for extractive and abstractive summarization using inconsistency loss[C]// Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: Association for Computational Linguistics, 2018: 132-141.
29 深度学习之文本摘要自动生成[EB/OL]. [2018-11-03]. https://blog.csdn.net/wjmnju/article/details/78710498.
30 AndhaleN, BewoorL A. An overview of text summarization techniques[C]// Proceedings of the International Conference on Computing Communication Control and Automation. IEEE, 2016: 1-7.
31 VinyalsO, KaiserL, KooT, et al. Grammar as a foreign language[OL]. [2020-02-25]. https://arxiv.org/abs/1412.7449.
32 LopyrevK. Generating news headlines with recurrent neural networks[OL]. [2020-02-25]. https://arxiv.org/abs/1512.01712.
33 NiuJ W, ChenH, ZhaoQ J, et al. Multi-document abstractive summarization using chunk-graph and recurrent neural network[C]// Proceedings of the IEEE International Conference on Communications. IEEE, 2017: 1-6.
34 刘家益, 邹益民. 近70年文本自动摘要研究综述[J]. 情报科学, 2017, 35(7): 154-161.
35 宗成庆. 统计自然语言处理[M]. 北京: 清华大学出版社, 2008: 380.
36 刘福君. 基于指代消解的自动文摘研究[D]. 合肥: 安徽大学, 2012.
37 刘挺, 王开铸. 自动文摘的四种主要方法[J]. 情报学报, 1999, 18(1): 10-19.
38 ChoK, van Merri?nboerB, GulcehreC, et al. Learning phrase representations using RNN encoder–decoder for statistical machine translation[C]// Proceedings of the Conference on Empirical Methods in Natural Language Processing. Stroudsburg: Association for Computational Linguistics, 2014: 1724-1734.
39 赵洪, 王芳, 王晓宇, 等. 基于大规模政府公文智能处理的知识发现及应用研究[J]. 情报学报, 2018, 37(8): 805-812.
40 RushA M, ChopraS, WestonJ. A neural attention model for abstractive sentence summarization[C]// Proceedings of the Conference on Empirical Methods in Natural Language Processing. Stroudsburg: Association for Computational Linguistics, 2015: 379-389.
41 Document Understanding Conferences. Past data[EB/OL]. [2019-03-02]. https://duc.nist.gov/data.html.
42 Text Analysis Conference. Past TAC data[EB/OL]. [2019-03-02]. https://tac.nist.gov/data/index.html.
43 HermannK M, Ko?iskyT, GrefenstetteE, et al. Teaching machines to read and comprehend[C]// Proceedings of the 28th International Conference on Neural Information Processing Systems. Cambridge: MIT Press, 2015, 1: 1693-1701.
44 SandhausE. The New York Times annotated corpus[EB/OL]. [2019-03-02]. https://catalog.ldc.upenn.edu/LDC2008T19.
45 GraffD, CieriC. English gigaword[EB/OL]. [2019-03-02]. https://catalog.ldc.upenn.edu/LDC2003T05.
46 HuB T, ChenQ C, ZhuF Z. LCSTS: A large scale Chinese short text summarization dataset[EB/OL]. [2019-03-02]. http://icrc.hitsz.edu.cn/Article/show/139.html.
47 NLPCC 2017. NLPCC 2017 shared data[EB/OL]. [2019-03-02]. http://tcci.ccf.org.cn/conference/2017/taskdata.php.
48 NLPCC 2018. NLPCC 2018 shared data[EB/OL]. [2019-03-02]. http://tcci.ccf.org.cn/conference/2018/taskdata.php.
49 搜狗实验室. 数据资源[EB/OL]. [2019-03-02]. https://www.sogou.com/labs/resource/list_news.php.
50 LinC Y. ROUGE: A package for automatic evaluation of summaries[C]// Proceedings of the Workshop on Text Summarization Branches Out. Stroudsburg: Association for Computational Linguistics, 2004: 74-81.
51 NgJ P, AbrechtV. Better summarization evaluation with word embeddings for ROUGE[C]// Proceedings of the Conference on Empirical Methods in Natural Language Processing. Stroudsburg: Association for Computational Linguistics, 2015: 1925-1930.
52 LiP, CaiW D, HuangH. Weakly supervised natural language processing framework for abstractive multi-document summarization[C]// Proceedings of the 24th ACM International Conference on Information and Knowledge Management. New York: ACM Press, 2015: 1401-1410.
53 BanerjeeS, MitraP, SugiyamaK. Multi-document abstractive summarization using ILP based multi-sentence compression[C]// Proceedings of the 24th International Conference on Artificial Intelligence. International Joint Conferences on Artificial Intelligence, 2015: 1208-1214.
54 ZhangJ M, TanJ W, WanX J. Towards a neural network approach to abstractive multi-document summarization[OL]. [2020-03-05]. https://arxiv.org/abs/1804.09010.
55 MikolovT, KarafiátM, BurgetL, et al. Recurrent neural network based language model[C]// Proceedings of the 14th Annual Conference of the International Speech Communication Association, Makuhari, Chiba, Japan, 2010: 1045-1048.
56 赵洪, 王芳. 理论术语抽取的深度学习模型及自训练算法研究[J]. 情报学报, 2018, 37(9): 923-938.
57 SutskeverI, VinyalsO, LeQ V. Sequence to sequence learning with neural networks[OL]. [2020-03-05]. https://arxiv.org/abs/1409.3215.
58 LiP J, LamW, BingL D, et al. Deep recurrent generative decoder for abstractive text summarization[C]// Proceedings of the Conference on Empirical Methods in Natural Language Processing. Stroudsburg: Association for Computational Linguistics, 2017: 2081-2090.
59 Al-SabahiK, ZhangZ P, YangK. Bidirectional attentional encoder-decoder model and bidirectional beam search for abstractive summarization[OL]. [2020-03-05]. https://arxiv.org/abs/1809.06662.
60 YeasminS, TumpaP B, NituA M, et al. Study of abstractive text summarization techniques[J]. American Journal of Engineering Research, 2017, 6(8): 253-260.
61 GenestP E, LapalmeG. Text generation for abstractive summarization[C]// Proceedings of the Third Text Analysis Conference, TAC 2010 Workshop, Gaithersburg, Maryland, USA, 2010.
62 searchBest-first[EB/OL]. [2018-11-03]. https://en.wikipedia.org/wiki/Best-first_search.
63 searchRandom[EB/OL]. [2018-11-03]. https://en.wikipedia.org/wiki/Random_search.
64 searchBeam[EB/OL]. [2018-11-03]. https://en.wikipedia.org/wiki/Beam_search.
65 SuzukiJ, NagataM. Cutting-off redundant repeating generations for neural abstractive summarization[C]// Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics. Stroudsburg: Association for Computational Linguistics, 2017, 2: 291-297.
66 ZhangY, ShenD, WangG, et al. Deconvolutional paragraph representation learning[OL]. [2020-03-05]. https://arxiv.org/abs/1708.04729.
67 GehringJ, AuliM, GrangierD, et al. A convolutional encoder model for neural machine translation[C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: Association for Computational Linguistics, 2017: 123-135.
68 GehringJ, AuliM, GrangierD, et al. Convolutional sequence to sequence learning[C]// Proceedings of the 34th International Conference on Machine Learning. IMLS, 2017, 70: 1243-1252.
69 van den OordA, KalchbrennerN, VinyalsO, et al. Conditional image generation with PixelCNN decoders[OL]. [2020-03-05]. https://arxiv.org/abs/1606.05328.
70 DauphinY N, FanA, AuliM, et al. Language modeling with gated convolutional networks[J]. Proceedings of Machine Learning Research, 2017, 70: 933-941.
71 HeK M, ZhangX Y, RenS Q, et al. Deep residual learning for image recognition[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE Computer Society, 2016: 770-778.
72 SongS L, HuangH T, RuanT X. Abstractive text summarization using LSTM-CNN based deep learning[J]. Multimedia Tools and Applications, 2019, 78(1): 857-875.
73 张俊林. 深度学习中的注意力机制(2017版)[EB/OL]. [2019-03-01]. https://blog.csdn.net/malefactor/article/details/78767781.
74 TishbyN, ZaslavskyN. Deep learning and the information bottleneck principle[C]// Proceedings of the IEEE Information Theory Workshop. IEEE, 2015.
75 VaswaniA, ShazeerN, ParmarN, et al. Attention is all you need[OL]. [2020-03-05]. https://arxiv.org/abs/1706.03762.
76 LuongM T, PhamH, ManningC D. Effective approaches to attention-based neural machine translation[C]// Proceedings of the Conference on Empirical Methods in Natural Language Processing. Stroudsburg: Association for Computational Linguistics, 2015: 1412-1421.
77 NallapatiR, ZhouB W, dos SantosC, et al. Abstractive text summarization using sequence-to-sequence RNNs and beyond[C]// Proceedings of the 20th SIGNLL Conference on Computational Natural Language Learning. Stroudsburg: Association for Computational Linguistics, 2016: 280-290.
78 SankaranB, MiH T, Al-OnaizanY, et al. Temporal attention model for neural machine translation[OL]. [2020-03-08]. https://arxiv.org/abs/1608.02927.
79 GuJ T, LuZ D, LiH, et al. Incorporating copying mechanism in sequence-to-sequence learning[C]// Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: Association for Computational Linguistics, 2016, 1: 1631-1640.
80 WangL, YaoJ L, TaoY Z, et al. A reinforced topic-aware convolutional sequence-to-sequence model for abstractive text summarization[C]// Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence. Palo Alto: AAAI Press, 2018: 4453-4460.
81 DengY T, KimY, ChiuJ, et al. Latent alignment and variational attention[OL]. [2020-03-08]. https://arxiv.org/pdf/1807.03756.pdf.
82 深度学习中的Attention模型介绍及其进展[EB/OL]. [2019-03-01]. https://blog.csdn.net/jteng/article/details/52864401.
83 BahdanauD, BrakelP, XuK, et al. An actor-critic algorithm for sequence prediction[C]// [2020-03-08]. https://arxiv.org/abs/1607.07086.
84 姚均霖. 当深度学习遇见自动文本摘要[EB/OL]. [2018-11-30]. https://cloud.tencent.com/developer/article/1005548.
85 李晨溪, 曹雷, 张永亮, 等. 基于知识的深度强化学习研究综述[J]. 系统工程与电子技术, 2017, 39(11): 2603-2613.
86 RanzatoM, ChopraS, AuliM, et al. Sequence level training with recurrent neural networks[OL]. [2020-03-08]. https://arxiv.org/abs/1511.06732.
87 RennieS J, MarcheretE, MrouehY, et al. Self-critical sequence training for image captioning[C]// Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2017.
88 PaulusR, XiongC M, SocherR. A deep reinforced model for abstractive summarization[OL]. [2020-03-08]. https://arxiv.org/abs/1705.04304.
89 LiuC W, LoweR, SerbanI V, et al. How not to evaluate your dialogue system: An empirical study of unsupervised evaluation metrics for dialogue response generation[C]// Proceedings of the Conference on Empirical Methods in Natural Language Processing. Stroudsburg: Association for Computational Linguistics, 2016: 2122-2132.
90 KeneshlooY, ShiT, RamakrishnanN, et al. Deep reinforcement learning for sequence to sequence models[J/OL]. IEEE Transactions on Neural Networks and Learning Systems, 2019, https://ieeexplore.ieee.org/document/8801910.
91 LiP J, BingL D, LamW. Actor-critic based training framework for abstractive summarization[OL]. [2020-03-08]. https://arxiv.org/abs/1803.11070.
92 BahdanauD, ChoK, BengioY. Neural machine translation by jointly learning to align and translate[OL]. [2020-03-08]. https://arxiv.org/abs/1409.0473.
93 ChenY C, BansalM. Fast abstractive summarization with reinforce-selected sentence rewriting[C]// Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: Association for Computational Linguistics, 2018: 675-686.
94 SeeA, LiuP J, ManningC D. Get to the point: Summarization with pointer-generator networks[C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics. Stroudsburg: Association for Computational Linguistics, 2017: 1073-1083.
95 MnihV, BadiaA P, MirzaM, et al. Asynchronous methods for deep reinforcement learning[OL]. [2020-03-08]. https://arxiv.org/abs/1602.01783.