1. Institute of Scientific and Technical Information of China, Beijing 100038
2. Key Laboratory of Rich-media Knowledge Organization and Service of Digital Publishing Content, National Press and Publication Administration, Beijing 100038