Research on Construction and Application of a Knowledge Discovery System Based on Intelligent Processing of Large-scale Governmental Documents
Zhao Hong1,2, Wang Fang1, Wang Xiaoyu1, Zhang Weichong1, Yang Jing1
1. Department of Information Resources Management, Business School, Nankai University, Tianjin 300071; 2. CETC Big Data Research Institute Co., Ltd., Guiyang 550081
Abstract: Governmental documents are an important knowledge resource. Knowledge discovery over governmental documents, based on intelligent processing, supports intelligent knowledge management and improves administrative efficiency in document drafting, approval, circulation, and archiving. It can also significantly advance the construction of digital government and enhance the effectiveness of governance. To implement intelligent processing of large-scale governmental documents, this paper proposes a series of processing steps: content structure analysis; automatic subject indexing; abstractive summarization; key content extraction and ranking; knowledge extraction and linking for policy, decree, and administrative enforcement documents; and task decomposition and right-responsibility matching in governmental decree documents. Based on these steps, the paper constructs a knowledge discovery system and analyzes its application. Finally, it presents a case of knowledge discovery on specific governmental documents.
Received: 03 May 2018