Research on Patent Retrieval Strategy Based on BERT Word Embedding
Zhou Xiao, Gao Yaqian, Fan Jiayi
School of Economics and Management, Xidian University, Xi’an 710126
Abstract Patent analysis is a powerful tool for assessing scientific and technological innovation capability and identifying trends in market transformation, and it provides an important basis for China's new round of technological revolution and industrial transformation. A sound and efficient patent retrieval strategy is a prerequisite for such analysis. This study develops a set of retrieval strategies based on deep learning algorithms to address a shortcoming of existing research, whose retrieval strategies are insufficiently dynamic and intelligent. The proposed model consists of two main parts: construction of the retrieval strategy and revision of the retrieval results. To construct the retrieval strategy, the study systematically analyzes the principles of technology composition, integrates a deep learning algorithm, and trains the model on two corpora, a scientific and technological corpus and a domain corpus, in order to preliminarily screen and expand the search terms.
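The screening and expansion of search terms described in the abstract can be illustrated with a minimal sketch. It assumes each candidate term already has an embedding vector (in the study these would come from a BERT model trained on the two corpora) and ranks candidates by cosine similarity to a seed term. The function names, the toy three-dimensional vectors, the example terms, and the `top_k` and `threshold` parameters are all hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def expand_terms(seed_terms, embeddings, top_k=2, threshold=0.8):
    """For each seed term, return up to top_k non-seed terms whose
    cosine similarity to the seed is at least threshold."""
    expanded = {}
    for seed in seed_terms:
        seed_vec = embeddings[seed]
        scored = [
            (cand, cosine(seed_vec, vec))
            for cand, vec in embeddings.items()
            if cand not in seed_terms
        ]
        # Screening step: drop candidates below the similarity threshold.
        scored = [(cand, score) for cand, score in scored if score >= threshold]
        # Expansion step: keep the most similar remaining candidates.
        scored.sort(key=lambda pair: pair[1], reverse=True)
        expanded[seed] = [cand for cand, _ in scored[:top_k]]
    return expanded


# Hypothetical toy vectors standing in for BERT word embeddings.
TOY_EMBEDDINGS = {
    "wireless power transfer": [0.9, 0.1, 0.0],
    "wireless energy transmission": [0.85, 0.15, 0.05],
    "inductive charging": [0.7, 0.3, 0.1],
    "image segmentation": [0.0, 0.2, 0.9],
}

result = expand_terms(["wireless power transfer"], TOY_EMBEDDINGS)
print(result)
```

With these toy vectors, the two related terms pass the threshold and the unrelated term is screened out. In practice the threshold and the number of expansion terms would need tuning against the retrieval results.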
Received: 05 January 2023