Research on Multi-source Data Fusion Method in Scientometrics
Xu Haiyun1, 2, Dong Kun1, 3, Wei Ling1, 3, Wang Chao1, 3, Yue Zenghui4
1. Chengdu Library and Information Center, Chinese Academy of Sciences, Chengdu 610041; 2. Institute of Scientific and Technical Information of China (ISTIC), Beijing 100038; 3. University of Chinese Academy of Sciences, Beijing 100190; 4. School of Medical Information Engineering, Jining Medical University, Rizhao 276826
Abstract This paper systematically reviews research on and applications of multi-source data fusion in scientometrics. Multi-source data fusion in scientometrics can be divided into early fusion, mid-term fusion, and late fusion; the review focuses on how relations among multiple data types are acquired and how those relations are fused. It then identifies the challenges facing multi-source data fusion in scientometric analysis and possible directions for future breakthroughs. Finally, drawing on fusion methods from mathematics, it constructs a multi-source data fusion process for scientometric analysis and outlines future development trends.
Received: 01 November 2017