An Extended Belief Network Retrieval Model Based on Document Relationships
Xu Jianmin1, He Dandan1, Wu Shufang2
1. College of Cyberspace Security and Computer, Hebei University, Baoding 071002
2. School of Management, Hebei University, Baoding 071002
Abstract Making appropriate use of the relationships among documents can improve the performance of a retrieval model. Because the basic belief network retrieval model does not use document relationships, an extended belief network retrieval model with two layers of document nodes is proposed, and its topology and probabilistic inference are given. In the topology, arcs represent the relationships between terms and queries, between terms and documents, and between the two layers of documents; the relationship between two documents is determined by their similarity. In the probabilistic inference, the original retrieval probability is adjusted using document similarity and the number of parent documents, which makes it more accurate. In the experiments, discounted cumulative gain and the precision-recall curve are used to evaluate the extended model. The results show that the extended model ranks relevant documents more reasonably and improves precision while maintaining recall.
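As an illustration of the discounted cumulative gain measure mentioned in the abstract, the following is a minimal sketch. It assumes the common formulation DCG@k = sum of rel_i / log2(i + 1) over ranks i = 1..k; the abstract does not state which variant the authors use. The function name `dcg` and the relevance grades below are hypothetical and are not data from the paper.

```python
import math


def dcg(relevances, k=None):
    """Discounted cumulative gain of a ranked list of relevance grades.

    Assumed variant: each grade is divided by log2(rank + 1), with ranks
    starting at 1. Other variants exist; this is only an illustration.
    """
    ranked = relevances if k is None else relevances[:k]
    return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(ranked, start=1))


# Hypothetical relevance grades for the same four documents under two rankings.
baseline_ranking = [1, 0, 2, 3]   # highly relevant documents appear late
reranked = [3, 2, 1, 0]           # highly relevant documents appear first

print(dcg(baseline_ranking))
print(dcg(reranked))
```

Under this formulation, a ranking that places highly relevant documents nearer the top receives a higher score, which is why the measure is suited to judging whether a ranking is more reasonable.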
Received: 12 December 2018