A Deep Learning Model and Self-Training Algorithm for Theoretical Terms Extraction
Zhao Hong, Wang Fang |
Department of Information Resources Management, Business School, Nankai University, Tianjin 300071
|
|
Abstract: The extraction of theoretical terms from academic literature is a precondition for several lines of research in information science, such as large-scale content analysis and the revelation of interdisciplinary knowledge transfer. As a specific type of named entity, theoretical terms are scattered across many subjects and a large body of published literature, exhibit complex characteristics, and lack large-scale mature corpora, all of which make their extraction challenging. To improve extraction performance and reduce the cost of manually tagging the training set, this study builds a deep learning model for theoretical term extraction based on the characteristics of the terms, together with a self-training algorithm that enables weakly supervised learning of the model; the feature construction and tagging methods used in the model are also examined. The validity of the model and the self-training algorithm is verified through comparative experiments. This study not only provides a more effective method for theoretical term extraction but also serves as a reference for the recognition of other named entities.
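The self-training strategy the abstract refers to can be illustrated with a generic confidence-thresholded loop: train on the labeled seed set, pseudo-label the unlabeled pool, and fold high-confidence predictions back into the training data. The sketch below is a minimal illustration of that general pattern, not the paper's actual algorithm; the toy lexicon "model", the fixed confidence scores, and all function names are assumptions introduced for illustration only.

```python
def train(examples):
    # Toy stand-in for the real tagger: remember which tokens
    # were labeled as theoretical terms ("TERM").
    return {tok for tok, label in examples if label == "TERM"}

def predict(model, token):
    # Return a (label, confidence) pair for one token.
    # Confidences are fixed here purely for illustration.
    if token in model:
        return "TERM", 0.9
    return "O", 0.6

def self_train(labeled, unlabeled, threshold=0.8, max_rounds=5):
    # Generic self-training loop: pseudo-label the unlabeled pool,
    # keep only predictions whose confidence clears the threshold,
    # and add them to the training set for the next round.
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        model = train(labeled)
        confident, rest = [], []
        for tok in pool:
            label, conf = predict(model, tok)
            (confident if conf >= threshold else rest).append((tok, label))
        if not confident:
            break  # no new high-confidence pseudo-labels; stop early
        labeled.extend(confident)
        pool = [tok for tok, _ in rest]
    return train(labeled)
```

In the paper's setting the toy lexicon would be replaced by the deep sequence-labeling model and the fixed scores by its tagging confidence, but the control flow of iterative pseudo-labeling is the same.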
|
Received: 02 February 2018
|
|
|
|
|
|
|