Data Fairness: A New Social Issue in the Era of Digital Economy

Huo Chaoguang1,2, Zhao Dongxiang3,4,5,6

1. School of Information Resource Management, Renmin University of China, Beijing 100872
2. Institute for AI Governance, Renmin University of China, Beijing 100872
3. School of Information Management, Zhengzhou University, Zhengzhou 450001
4. Research Center for “Double World-Class Project” of Henan Province, Zhengzhou 450001
5. Data Governance Research Center of Henan Province, Zhengzhou 450001
6. Zhengzhou Data Science Research Center, Zhengzhou 450001

Abstract With the rapid advancement of the digital economy and the ongoing evolution of data intelligence technologies, data fairness is emerging as a critical social issue that society cannot avoid, joining long-standing concerns over fairness in income, education, and healthcare. Unlike traditional factors of production (e.g., labor, capital, land, and technology), data raise complex and thorny social equity issues. Because theoretical research on data fairness in Chinese academia lags behind practice driven by digital intelligence development and the evolving digital economy, this paper systematically reviews relevant global multidisciplinary research, proposes a comprehensive data fairness framework, and analyzes six basic principles of data fairness. It discusses the five main dimensions and four edge dimensions of data fairness, as well as the data fairness issues that arise in the three main stages of the data life cycle: collection and acquisition, processing and analysis, and sharing and utilization. By systematically discussing and proposing a content framework of data fairness for the first time, the paper enriches existing theoretical research on data fairness, expands the connotation of social equity in the era of the digital economy, and provides theoretical support for data governance.
Received: 30 January 2025