Identifying and Visualizing Emerging Trends in Domain Based on PWLR Model
Liu Ziqiang1, Hu Zhengyin2,3, Xu Haiyun2,3, Fang Shu2,3
1. School of Journalism and Communication, Nanjing Normal University, Nanjing 210097; 2. Chengdu Library and Information Center, Chinese Academy of Sciences, Chengdu 610041; 3. Department of Library Information and Archives Management, School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190
Abstract: An accurate and effective framework for analyzing emerging trends is of great value to information work such as trend assessment and research and public opinion monitoring. First, bi-grams and tri-grams are extracted as multi-lexical features using an N-gram model, and a piecewise linear regression (PWLR) model is then used to fit these features. The fitted model is used to detect multi-lexical features that emerge in the most recent period of the timeline and to accurately identify new terms with development potential. Building on the terms identified in this step, a hierarchical clustering algorithm is used to identify emerging trends in the field, and the results are visualized. An empirical study of the gene editing field identifies the main emerging trends as CRISPR-Cas9 technology, gene therapy, and animal and plant gene editing, which supports the feasibility and validity of the proposed method.
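The first two steps described in the abstract (N-gram extraction and PWLR fitting) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the number of segments, the breakpoint search, or the emergence criterion, so the two-segment fit, the exhaustive breakpoint search, the `min_slope` threshold, and all function names below are assumptions made for illustration. The hierarchical clustering and visualization steps are omitted.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def yearly_counts(docs_by_year, sizes=(2, 3)):
    """Count bi-grams and tri-grams per year.

    Returns (sorted years, {term: [count in each year]}).
    """
    years = sorted(docs_by_year)
    series = {}
    for idx, year in enumerate(years):
        counts = Counter()
        for doc in docs_by_year[year]:
            tokens = doc.lower().split()  # naive whitespace tokenizer (assumption)
            for n in sizes:
                counts.update(ngrams(tokens, n))
        for term, c in counts.items():
            series.setdefault(term, [0] * len(years))[idx] = c
    return years, series


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (slope, sum of squared errors)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return slope, sse


def fit_pwlr(ys, min_len=2):
    """Two-segment piecewise linear fit (assumed form).

    Tries every breakpoint that leaves at least min_len points per segment
    and keeps the one with the lowest total squared error.
    Returns (breakpoint index, slope before, slope after).
    """
    if len(ys) < 2 * min_len:
        raise ValueError("series too short for a two-segment fit")
    xs = list(range(len(ys)))
    best = None
    for k in range(min_len, len(ys) - min_len + 1):
        s1, e1 = fit_line(xs[:k], ys[:k])
        s2, e2 = fit_line(xs[k:], ys[k:])
        if best is None or e1 + e2 < best[0]:
            best = (e1 + e2, k, s1, s2)
    return best[1], best[2], best[3]


def is_emerging(ys, min_slope=1.0):
    """Hypothetical criterion: the recent segment rises steeply and faster than before."""
    _, s1, s2 = fit_pwlr(ys)
    return s2 >= min_slope and s2 > s1
```

For example, a term whose yearly counts are `[0, 0, 0, 0, 2, 4, 6]` would be flagged by `is_emerging`, whereas a flat or declining series would not. In practice the threshold would need to be tuned to the corpus size.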
Received: 20 October 2019