Chasing the Cause and Tracing the Effect: Application and Prospect of Causal Inference Methods in the Field of Information Resource Management
Wu Jiang1, Tao Chengxu1, Ou Guiyan1, Ding Yang2,3,4
1. School of Information Management, Wuhan University, Wuhan 430072; 2. Business School, The University of Edinburgh, Edinburgh EH8 9JS; 3. Edinburgh Futures Institute, The University of Edinburgh, Edinburgh EH3 9EF; 4. Lab of Interdisciplinary Spatial Analysis, University of Cambridge, Cambridge CB3 9EP
Wu Jiang, Tao Chengxu, Ou Guiyan, Ding Yang. Chasing the Cause and Tracing the Effect: Application and Prospect of Causal Inference Methods in the Field of Information Resource Management[J]. Journal of the China Society for Scientific and Technical Information (情报学报), 2025, 44(7): 915-932.