SciBERT-based Measurement of Knowledge Intersection in Scientific Collaboration and Its Causal Effects on Sustained Research Output
Feng Xiaodong, Huang Yuhang
School of Information Management, Sun Yat-sen University, Guangzhou 510006
Abstract: As scientific research problems grow more complex and integrated, scientific collaboration and interdisciplinary research have intensified. Knowledge intersection is a process of interaction, integration, and innovation across different fields, and it can lead to significant scientific achievements. Existing interdisciplinary research works largely at the granularity of whole disciplines and presents a static picture. Moving beyond that perspective, this study measures fine-grained knowledge intersection between research subjects and their collaborators, with respect to the subjects' dynamically accumulated knowledge and experience and their sustained research output. We construct a measurement model that quantifies the intensity of knowledge intersection in research collaborations using text mining: SciBERT, a bidirectional encoder representations from transformers (BERT) model trained on scientific text, represents the knowledge concepts or sentences in a manuscript. Using publication data from the Information Systems discipline as a case study, we generate panel data and apply the generalized propensity score matching method to estimate causal effects. We further examine how the effects differ between knowledge intersection in research questions and knowledge intersection in research methodology. The findings indicate that overall knowledge intersection in research collaboration has an inverted U-shaped effect on subsequent research output. Notably, knowledge intersection in research methodologies influences subjects' sustained research output more strongly than knowledge intersection in research questions.
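The abstract does not spell out the measurement model, so the following is only a minimal sketch of one way an embedding-based intersection intensity could be computed. It assumes, purely for illustration, that each party's knowledge is summarized by the mean of its text vectors and that intensity is one minus the cosine similarity of those means. The function names and the toy three-dimensional vectors are hypothetical; in practice the vectors would be SciBERT embeddings of knowledge concepts or sentences.

```python
import math


def mean_vector(vectors):
    """Average a list of equal-length vectors component-wise."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def intersection_intensity(subject_vecs, collaborator_vecs):
    """Assumed proxy: 1 - cosine similarity of the two mean knowledge vectors.

    This is an illustrative definition, not the measurement model of the paper.
    """
    subject_mean = mean_vector(subject_vecs)
    collaborator_mean = mean_vector(collaborator_vecs)
    return 1.0 - cosine_similarity(subject_mean, collaborator_mean)


# Toy stand-ins for embeddings (hypothetical data).
subject = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
same_field = [[2.0, 0.0, 0.0]]    # same direction as the subject's knowledge
other_field = [[0.0, 1.0, 0.0]]   # orthogonal to the subject's knowledge

low = intersection_intensity(subject, same_field)
high = intersection_intensity(subject, other_field)
```

Under this assumed definition, a collaborator whose knowledge vector points in the same direction as the subject's yields an intensity near 0, and one with an orthogonal vector yields an intensity near 1. A continuous score of this kind is what a generalized propensity score analysis would treat as the dose.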
Received: 23 October 2024