Automatic Recognition of Research Methods from the Full-text of Academic Articles
Zhang Chengzhi, Zhang Yingyi
Department of Information Management, School of Economics and Management, Nanjing University of Science & Technology, Nanjing 210094
Abstract The degree to which research methods are standardized marks the maturity of a discipline's development. In information science, theoretical analysis and normative research have gradually started to attract researchers' attention, but quantitative analyses of research methods are still lacking. When a research method appears in an academic article, it is either used in that article or merely cited for analysis or comparison. Knowing which methods an article uses lets researchers quickly grasp its key content, and summarizing the methods that articles cite helps clarify how those methods have evolved and developed in the field. This paper therefore divides research methods into those used by academic articles and those cited in them. First, it compares a variety of automatic named entity recognition methods, such as BiLSTM (bi-directional long short-term memory), and selects the best-performing model for the final identification of research method entities. The experimental results show that a jointly trained BiLSTM model based on character vectors and combined with a CRF (conditional random field) yields the best performance. The paper then uses the extracted research method entities to analyze how research methods are used in information science. The statistical analysis shows that experimental methods are both the most frequently used and the most frequently cited methods in information science.
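To make the used-versus-cited distinction concrete, the following minimal Python sketch shows how per-character tags from a sequence labeler (such as the BiLSTM-CRF model mentioned above) could be turned into typed research method entities. The BIO scheme, the `USED` and `CITED` label names, and the function `bio_to_spans` are assumptions made for illustration; the abstract does not state the tag set or post-processing the authors actually used.

```python
from typing import List, Optional, Tuple

# Hypothetical label set (not taken from the paper):
#   B-USED / I-USED   -> a method the article itself employs
#   B-CITED / I-CITED -> a method the article only mentions or compares against
#   O                 -> any other character


def bio_to_spans(chars: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (entity_text, entity_type) pairs from per-character BIO tags."""
    if len(chars) != len(tags):
        raise ValueError("chars and tags must have the same length")

    spans: List[Tuple[str, str]] = []
    current_chars: List[str] = []
    current_type: Optional[str] = None

    for ch, tag in zip(chars, tags):
        if tag.startswith("B-"):
            # A new entity starts; close the open one first, if any.
            if current_type is not None:
                spans.append(("".join(current_chars), current_type))
            current_chars = [ch]
            current_type = tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # Continue the entity that is currently open.
            current_chars.append(ch)
        else:
            # "O", or an I- tag that does not continue the open entity.
            if current_type is not None:
                spans.append(("".join(current_chars), current_type))
            current_chars = []
            current_type = None

    # Close an entity that runs to the end of the sequence.
    if current_type is not None:
        spans.append(("".join(current_chars), current_type))
    return spans


# Illustrative input; the tags are hand-written, not model output.
example_chars = list("we use SVM")
example_tags = ["O"] * 7 + ["B-USED", "I-USED", "I-USED"]
example_spans = bio_to_spans(example_chars, example_tags)
```

Counting the resulting pairs by type across a corpus would give separate usage and citation frequencies for each method, which is the kind of statistic the abstract reports.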
Received: 13 October 2019