Detection of Hotspot in Scientific Fields Based on Emerging Pattern Analysis of Social Q&A Community Contents
Yu Jing
Department of Politics, East China Normal University, Shanghai 200241
Abstract Identifying hotspots in scientific fields is a key research problem in science and technology intelligence and in bibliometrics. It informs policy-making by science, technology, and education departments and helps researchers decide where to direct their work. Existing hotspot identification methods rely primarily on bibliometric analysis and make little use of the abundant data available on the Web. This study proposes a hotspot identification framework based on emerging pattern recognition that uses the question-and-answer content of a social Q&A community to identify research hotspots in a field. First, keywords are extracted from the question-and-answer content and clustered by co-occurrence. Second, a set of candidate hotspot patterns is constructed from the clustering results, and an emerging pattern recognition method is applied to identify hotspots and analyze their trends. In experiments on a dataset from the “Machine Learning” topic of zhihu.com, the results were analyzed with the chi-square test and compared with frontier research. The results indicate that the framework can effectively identify research hotspots. The framework is practical to implement because keyword clustering alleviates the high computational complexity of emerging pattern recognition. It also has potential application value in hotspot identification for online communities and related problems.
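The emerging pattern step summarized in the abstract can be illustrated with a minimal sketch. It is not the author's implementation. It assumes the standard definition of an emerging pattern as a keyword set whose support grows by at least a chosen ratio between an earlier and a later time window. The function names, the `min_growth` threshold, and the toy keyword sets are all hypothetical, and the candidate patterns are assumed to come from a prior clustering step that is not shown.

```python
def support(pattern, transactions):
    """Fraction of transactions (keyword sets) that contain every keyword in pattern."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if pattern <= t)
    return hits / len(transactions)


def growth_rate(pattern, earlier, later):
    """Ratio of the pattern's support in the later window to that in the earlier one."""
    s_old = support(pattern, earlier)
    s_new = support(pattern, later)
    if s_old == 0.0:
        # Absent before: infinite growth if it now appears, otherwise no growth.
        return float("inf") if s_new > 0.0 else 0.0
    return s_new / s_old


def emerging_patterns(candidates, earlier, later, min_growth=2.0):
    """Keep the candidates whose growth rate reaches min_growth (assumed threshold)."""
    scored = [(p, growth_rate(p, earlier, later)) for p in candidates]
    return [(p, g) for p, g in scored if g >= min_growth]


# Toy data (invented for illustration): each set holds the keywords of one Q&A item.
earlier = [{"svm", "kernel"}, {"svm", "kernel"}, {"svm", "kernel", "cnn"}, {"cnn", "gpu"}]
later = [{"cnn", "gpu"}, {"cnn", "gpu"}, {"cnn", "gpu", "svm"}, {"svm", "kernel"}]

# Hypothetical candidate patterns, e.g. one per keyword cluster.
candidates = [frozenset({"svm", "kernel"}), frozenset({"cnn", "gpu"})]

hot = emerging_patterns(candidates, earlier, later)
```

Restricting the candidates to patterns drawn from keyword clusters, rather than enumerating every keyword subset, is what keeps the number of support computations small in this sketch.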
Received: 10 January 2020