Research on Query Expansion Based on Deep Learning
Yu Chuanming1, Cai Lin1, Hu Shasha1, An Lu2
1. School of Information and Safety Engineering, Zhongnan University of Economics and Law, Wuhan 430073
2. School of Information Management, Wuhan University, Wuhan 430072
Abstract This study introduces a deep learning framework into query expansion and combines local and global query expansion to address the query drift caused by pseudo relevance feedback. Query phrases and product names posted on eBay in 2017 serve as the experimental data. A deep learning query expansion model (DLQEM) is proposed to make query expansion based on pseudo relevance feedback more accurate and effective, and it is applied to information retrieval. The experimental results show that the DLQEM improves precision@10 by 3.5% and 3.7% over pseudo relevance feedback (PRF), which validates the hypothesis proposed in this study: intersecting the concept-related expansion words with the expansion words drawn from feedback information can effectively control the query drift that the feedback-derived expansion words cause. Deep learning can ease the difficulty that supervised learning has in obtaining good classification results on short-text corpora. Combining deep learning with the traditional query expansion model addresses two disadvantages of the traditional model, namely its need for user participation and its low retrieval speed, while also controlling query drift.
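The hypothesis stated in the abstract can be illustrated with a minimal Python sketch. It is not the DLQEM itself: the concept-related words are passed in as a hand-written set, standing in for whatever the deep learning component would produce, and the feedback words are simply the most frequent non-query terms in the top-ranked documents. The function names (`feedback_terms`, `expand_query`, `precision_at_k`), the parameters, and the toy data are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def feedback_terms(query, ranked_docs, top_docs=3, top_terms=5):
    """Local expansion (assumed form): frequent non-query terms in the top-ranked docs."""
    query_terms = set(query.lower().split())
    counts = Counter()
    for doc in ranked_docs[:top_docs]:
        counts.update(t for t in doc.lower().split() if t not in query_terms)
    return {term for term, _ in counts.most_common(top_terms)}


def expand_query(query, ranked_docs, concept_terms, top_docs=3, top_terms=5):
    """Keep only the feedback terms that are also concept-related terms."""
    local = feedback_terms(query, ranked_docs, top_docs, top_terms)
    kept = sorted(local & set(concept_terms))  # the intersection filters drifting terms
    return query.lower().split() + kept


def precision_at_k(ranked_ids, relevant_ids, k=10):
    """Fraction of the top-k retrieved items that are relevant."""
    return sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids) / k


# Toy data (invented): the third document would pull the query off topic.
docs = [
    "iphone case leather cover",
    "iphone case silicone cover",
    "samsung charger cable",
]
concepts = {"cover", "sleeve", "shell"}  # stand-in for model-derived concept words
print(expand_query("iphone case", docs, concepts))
```

In this toy run, feedback alone would suggest words such as "samsung" and "charger" from the off-topic third document; the intersection with the concept-related set keeps only "cover". `precision_at_k` shows how a precision@10 figure is computed once relevance judgments are available.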
Received: 27 August 2018