Methods of Extracting Non-Categorical Semantic Relations between Chinese Terms
Zhu Hui¹,², Wang Hao¹,², Su Xinning¹,², Deng Sanhong¹,²
1. School of Information Management, Nanjing University, Nanjing 210023; 2. Jiangsu Key Laboratory of Data Engineering and Knowledge Services, Nanjing University, Nanjing 210023
Abstract: Ontology is an effective method of knowledge organization and an important link in constructing the Semantic Web. Non-categorical semantic relations between concepts are an important part of an ontology. Because a term is the external expression of a concept, this paper combines co-occurrence analysis, structural analysis, template construction, logical reasoning, and other methods to build a model that extracts non-categorical semantic relations between terms from unstructured Chinese texts. The model extracts relations from two perspectives: content and structure. The paper presents the main operation flow of the model and the main components of each functional module, describes how the main components are implemented, and discusses the limitations of the methods. The research offers new ideas for extracting non-categorical semantic relations between terms, enriches the methods of knowledge discovery, and provides a reference for implementing feasible and effective knowledge organization.
Received: 15 October 2018