Application of SemRep and Burst Detection Algorithm in Bibliometric Analysis: A Case Study on the Development Trend of Drug Therapy
Xu Shuang, Xu Dan, Han Shuang, Yang Ying
Library of China Medical University, Shenyang 110122
Abstract: Burst detection identifies research frontiers by tracking the development of burst words, that is, words whose frequency grows sharply over a short period. SemRep extracts semantic relationships from natural language text based on the Unified Medical Language System (UMLS). This paper combines SemRep with a burst detection algorithm to reveal the research status and development trend of a field, and applies the approach to the research foci and hotspots of drug therapy for severe acute respiratory syndrome (SARS). In the context of the COVID-19 outbreak, the study provides leads for the selection and development of drugs for SARS-CoV-2 prevention and control. From a dataset of literature on SARS drug therapy, SemRep and the SemRep semantic processing system were used to extract the set of drug concepts linked by the “TREAT” relationship, as defined by the UMLS semantic relationships. After duplicates were removed, 51 valid concepts, such as “Ribavirin”, were obtained. These drugs were routine medications for SARS, used mainly for emergency clinical treatment during the outbreak. The burst weight index of each drug concept was then calculated with Kleinberg’s burst detection algorithm, and potential drugs for SARS were identified by ranking the concepts by this index. Most of these drugs were studied in antiviral laboratories after the outbreak had been curbed. The combination of SemRep and burst detection is applicable not only to drug therapy for diseases but also to identifying research hotspots in other disciplines.
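
The burst weight calculation mentioned in the abstract can be illustrated with a minimal sketch of the two-state, batched form of Kleinberg's burst detection model. This is not the authors' implementation: the function names, the parameters `s` and `gamma` with their default values, and the yearly counts in the usage lines are assumptions chosen for illustration. Here `r[t]` is the number of records in year `t` that mention a drug concept and `d[t]` is the total number of records in that year. The weight of a burst is the cost saved by explaining its years with the elevated rate instead of the base rate.

```python
import math


def state_cost(p, r, d):
    """Negative log-likelihood of r relevant records out of d at rate p."""
    log_binom = math.lgamma(d + 1) - math.lgamma(r + 1) - math.lgamma(d - r + 1)
    return -(log_binom + r * math.log(p) + (d - r) * math.log(1 - p))


def detect_bursts(r, d, s=2.0, gamma=1.0):
    """Two-state batched burst detection (sketch; s and gamma are assumed defaults).

    Returns the optimal state per batch (0 = base, 1 = burst) and a list of
    (start_index, end_index, weight) tuples, one per burst interval.
    """
    n = len(r)
    p0 = sum(r) / sum(d)          # base rate over the whole period
    p1 = min(s * p0, 0.9999)      # elevated rate in the burst state
    if not 0 < p0 < p1 < 1:
        raise ValueError("rates must satisfy 0 < p0 < p1 < 1")
    up = gamma * math.log(n)      # cost of moving from base to burst state

    # Viterbi pass: cost[j] is the cheapest cost of any path ending in state j.
    cost = [0.0, math.inf]        # the automaton starts in the base state
    back = []
    for rt, dt in zip(r, d):
        sigma = (state_cost(p0, rt, dt), state_cost(p1, rt, dt))
        new_cost, choice = [], []
        for j in (0, 1):
            cand = [cost[i] + (up if j > i else 0.0) for i in (0, 1)]
            best = 0 if cand[0] <= cand[1] else 1
            new_cost.append(cand[best] + sigma[j])
            choice.append(best)
        cost = new_cost
        back.append(choice)

    # Trace the optimal state sequence backwards.
    state = 0 if cost[0] <= cost[1] else 1
    states = [0] * n
    for t in range(n - 1, -1, -1):
        states[t] = state
        state = back[t][state]

    # Collect maximal runs of the burst state and sum their weights.
    bursts, t = [], 0
    while t < n:
        if states[t] == 1:
            start, weight = t, 0.0
            while t < n and states[t] == 1:
                weight += state_cost(p0, r[t], d[t]) - state_cost(p1, r[t], d[t])
                t += 1
            bursts.append((start, t - 1, weight))
        else:
            t += 1
    return states, bursts


# Hypothetical yearly counts for one drug concept (not data from the study).
mentions = [1, 1, 1, 12, 15, 2, 1, 1]
totals = [100] * 8
states, bursts = detect_bursts(mentions, totals)
print(states, bursts)
```

Ranking concepts by the largest weight among their bursts gives one possible ordering by burst strength. Larger values of `s` or `gamma` make the detector more conservative, so both would need tuning for a real dataset.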
Received: 14 May 2020