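Abstract: Academic paper profiling falls into two main categories: static profiling based on the corpus of the original text, and dynamic profiling based on the citation corpus. This paper focuses on the latter and introduces recent citation content analysis (CCA) methods into profiling research. By collecting the texts in which peers cite a paper, it dynamically characterizes and precisely labels the "academic impact" that the paper produces after publication; the resulting profiles provide important support for downstream tasks such as paper evaluation and the retrieval and recommendation of academic resources. First, on the basis of theoretical analysis, a three-dimensional academic paper profiling framework of "impact power, impact field, and impact path" (Im-PFP) is constructed. Second, the single highly cited paper "Co-citation in the scientific literature: a new measure of the relationship between two documents" (Small, 1973) is selected as a case study, and an empirical profiling study is carried out on a self-built, complete citation corpus. The empirical study tests the generality and practical value of the proposed Im-PFP framework, which can be extended to profiling single or multiple highly cited papers across disciplines. The profiling results make it possible to divide the academic life cycle of the case paper into stages, and to interpret and evaluate in depth the diachronic changes in its impact power, impact field, and impact path.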
Wang Xinyue, Zhao Danqun. Academic Paper Profiling Based on CCA: A Theoretical Framework and Empirical Study[J]. 情报学报, 2026, 45(3): 363-375.
1 王东, 李青, 张志刚, et al. Research on methods for constructing researcher profiles[J]. 情报学报, 2022, 41(8): 812-821. (in Chinese)
2 董文慧, 熊回香, 杜瑾, et al. Research on recommending research collaborators based on scholar profiles[J]. 数据分析与知识发现, 2022, 6(10): 20-34. (in Chinese)
3 王心玥, 赵丹群. Scholar profiling research in the era of digital intelligence: problems and approaches[J]. 图书情报知识, 2026, 43(1): 88-97, 123. (in Chinese)
4 郭红梅, 曾建勋. Research on an ontology-based tag system for research institutions[J]. 情报学报, 2022, 41(6): 574-583. (in Chinese)
5 胡潜, 吴茜, 董寒宇. Research on constructing diachronic profiles of research institutions based on literature datafication[J]. 情报理论与实践, 2024, 47(7): 88-96. (in Chinese)
6 耿海英, 张建东, 杨立英, et al. Comparative analysis of disciplinary structure profiles under different subject classification systems[J]. 情报科学, 2023, 41(10): 83-90, 120. (in Chinese)
7 胡正银, 刘蕾蕾, 代冰, et al. An exploration of knowledge discovery in the life and medical sciences based on domain knowledge graphs[J]. 数据分析与知识发现, 2020, 4(11): 1-14. (in Chinese)
8 Zhang G, Ding Y, Milojević S. Citation content analysis (CCA): a framework for syntactic and semantic analysis of citation content[J]. Journal of the American Society for Information Science and Technology, 2013, 64(7): 1490-1503.
9 Zhao D Q, Guo Q Y, Chen H P, et al. Corpus construction and mining for citation context analysis[J]. Data Science and Informetrics, 2021, 1(1): 96-114.
10 丁堃, 赵昕航, 林原, et al. Research on paper profiling for academic evaluation[J]. 情报理论与实践, 2022, 45(9): 94-101. (in Chinese)
11 吴江. Research on methods for evaluating the impact of research papers based on paper profiles[J]. 四川图书馆学报, 2022(3): 52-56. (in Chinese)
12 张吉玉, 张均胜, 乔晓东. A method for constructing review profiles of scientific and technical papers to support novelty assessment[J]. 情报理论与实践, 2023, 46(1): 159-167. (in Chinese)
13 Sudolska A, Lis A, Chodorek M. Research profiling for responsible and sustainable innovations[J]. Sustainability, 2019, 11(23): 6553.
14 Camara Viana L F, Hoffmann V E, da Silva Miranda Junior N. Regional resilience and innovation: paper profiles and research agenda[J]. Innovation & Management Review, 2023, 20(2): 119-131.
15 Anderson M H, Lemken R K. Citation context analysis as a method for conducting rigorous and impactful literature reviews[J]. Organizational Research Methods, 2023, 26(1): 77-106.
16 陈翀, 李楠, 梁冰, et al. A method for identifying scholars' academic expertise based on the characteristics of their research output[J]. 图书情报工作, 2019, 63(20): 96-103. (in Chinese)
17 徐曾旭林, 谢靖, 于倩倩. Research on design methods for a multidimensional talent evaluation model[J]. 数据分析与知识发现, 2021, 5(8): 122-131. (in Chinese)
18 Meng L, Wu B. Core discovery and relation extraction in organization profiling[C]// Proceedings of the 13th International Conference on Semantics, Knowledge and Grids (SKG). Piscataway: IEEE, 2017: 219-222.
19 田瑞强, 潘云涛. Research on world-class scientific and technical journals from a comprehensive profiling perspective[J]. 中国科技期刊研究, 2021, 32(9): 1111-1119. (in Chinese)
20 潘飞, 孙文礼, 王骁龙, et al. Construction and analysis of group profiles of journals funded by the "China Science and Technology Journal Excellence Action Plan": a case study of leading journals and key journals[J]. 中国科技期刊研究, 2024, 35(6): 831-840. (in Chinese)
21 Lu C, Ding Y, Zhang C Z. Understanding the impact change of a highly cited article: a content-based citation analysis[J]. Scientometrics, 2017, 112(2): 927-945.
22 祝清松, 冷伏海. Research on topic identification of highly cited papers based on citation content analysis[J]. 中国图书馆学报, 2014, 40(1): 39-49. (in Chinese)
23 王心玥, 赵丹群. Progress and review of research on citation sentiment identification[J]. 情报理论与实践, 2024, 47(1): 173-181, 189. (in Chinese)
24 Athar A. Sentiment analysis of citations using sentence structure-based features[C]// Proceedings of the ACL 2011 Student Session. Stroudsburg: Association for Computational Linguistics, 2011: 81-87.
25 Sula C A, Miller M. Citations, contexts, and humanistic discourse: toward automatic extraction and classification[J]. Literary and Linguistic Computing, 2014, 29(3): 452-464.
26 Raza H, Faizan M, Hamza A, et al. Scientific text sentiment analysis using machine learning techniques[J]. International Journal of Advanced Computer Science and Applications, 2019, 10(12): 157-165.
27 Lauscher A, Glavaš G, Ponzetto S P, et al. Investigating convolutional networks and domain-specific embeddings for semantic classification of citations[C]// Proceedings of the 6th International Workshop on Mining Scientific Publications. New York: ACM Press, 2017: 24-28.
28 Cohan A, Ammar W, Van Zuylen M, et al. Structural scaffolds for citation intent classification in scientific publications[C]// Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg: Association for Computational Linguistics, 2019: 3586-3596.
29 Jha R, Jbara A A, Qazvinian V, et al. NLP-driven citation analysis for scientometrics[J]. Natural Language Engineering, 2017, 23(1): 93-130.
30 李铮, 邓三鸿, 孔嘉, et al. Research on identifying scholars' academic impact: a perspective based on full citation data[J]. 图书情报工作, 2020, 64(12): 87-94. (in Chinese)
31 魏绪秋, 姜召昊, 常霞, et al. Research on evaluating the innovativeness of academic papers based on citation intent[J]. 情报理论与实践, 2023, 46(9): 24-30, 46. (in Chinese)
32 王剑, 高峰, 满芮, et al. Research on the relationship between citation distribution and citation motivation based on citation frequency and content analysis[J]. 情报杂志, 2013, 32(9): 100-103. (in Chinese)
33 Web of Science. Citation network[EB/OL]. [2025-11-27]. https://webofscience.clarivate.cn/wos/woscc/full-record/WOS:A1990DX15600001.
34 刘盛博, 王博, 唐德龙, et al. Research on paper impact based on citation content: the case of papers by Nobel laureates[J]. 图书情报工作, 2015, 59(24): 109-114. (in Chinese)
35 Al-Jamimi H A, BinMakhashen G M, Bornmann L. Use of bibliometrics for research evaluation in emerging markets economies: a review and discussion of bibliometric indicators[J]. Scientometrics, 2022, 127(10): 5879-5930.
36 Gou Z Y, Meng F, Chinchilla-Rodríguez Z, et al. Encoding the citation life-cycle: the operationalization of a literature-aging conceptual model[J]. Scientometrics, 2022, 127(8): 5027-5052.
37 Marres N, de Rijcke S. From indicators to indicating interdisciplinarity: a participatory mapping methodology for research communities in-the-making[J]. Quantitative Science Studies, 2020, 1(3): 1041-1055.
38 Small H. Co-citation in the scientific literature: a new measure of the relationship between two documents[J]. Journal of the American Society for Information Science, 1973, 24(4): 265-269.
39 Scientometrics[EB/OL]. [2025-11-27]. https://link.springer.com/journal/11192/articles.
40 Hummon N P, Doreian P. Connectivity in a citation network: the development of DNA theory[J]. Social Networks, 1989, 11(1): 39-63.
41 Liu J S, Lu L Y Y, Ho M H. A few notes on main path analysis[J]. Scientometrics, 2019, 119(1): 379-391.
42 Yu D J, Yan Z P. Main path analysis considering citation structure and content: case studies in different domains[J]. Journal of Informetrics, 2023, 17(1): 101381.
43 Van Raan A F J. Comments on Henry Small, recipient of the 1987 Derek de Solla Price award[J]. Scientometrics, 1988, 14(5): 361-363.
44 OECD. Making open science a reality[R]. Paris: OECD Publishing, 2015: 25.
45 Kwok K L. A probabilistic theory of indexing and similarity measure based on cited and citing documents[J]. Journal of the American Society for Information Science, 1985, 36(5): 342-351.
46 Li P, Liu H Y, Yu J X, et al. Fast single-pair simrank computation[C]// Proceedings of the 2010 SIAM International Conference on Data Mining. Philadelphia: Society for Industrial and Applied Mathematics, 2010: 571-582.
47 Yu W R, McCann J, Zhang C Y, et al. Scaling high-quality pairwise link-based similarity retrieval on billion-edge graphs[J]. ACM Transactions on Information Systems, 2022, 40(4): Article No.78.
48 Belter C W. A bibliometric analysis of NOAA’s office of ocean exploration and research[J]. Scientometrics, 2013, 95(2): 629-644.
49 Small H, Griffith B C. The structure of scientific literatures Ⅰ: identifying and graphing specialties[J]. Science Studies, 1974, 4(1): 17-40.
50 Griffith B C, Small H G, Stonehill J A, et al. The structure of scientific literatures Ⅱ: toward a macro- and microstructure for science[J]. Science Studies, 1974, 4(4): 339-365.
51 White H D, Griffith B C. Author cocitation: a literature measure of intellectual structure[J]. Journal of the American Society for Information Science, 1981, 32(3): 163-171.
52 Culnan M J. Mapping the intellectual structure of MIS, 1980-1985: a co-citation analysis[J]. MIS Quarterly, 1987, 11(3): 341-353.
53 Batistič S, Černe M, Vogel B. Just how multi-level is leadership research? A document co-citation analysis 1980–2013 on leadership constructs and outcomes[J]. The Leadership Quarterly, 2017, 28(1): 86-103.
54 Batistič S, van der Laken P. History, evolution and future of big data and analytics: a bibliometric analysis of its relationship to performance in organizations[J]. British Journal of Management, 2019, 30(2): 229-251.
55 Zupic I, Čater T. Bibliometric methods in management and organization[J]. Organizational Research Methods, 2015, 18(3): 429-472.
56 Cobo M J, López-Herrera A G, Herrera-Viedma E, et al. Science mapping software tools: review, analysis, and cooperative study among tools[J]. Journal of the American Society for Information Science and Technology, 2011, 62(7): 1382-1402.
57 McCain K W. The author cocitation structure of macroeconomics[J]. Scientometrics, 1983, 5(5): 277-289.
58 McCain K W. Longitudinal author cocitation mapping: the changing structure of macroeconomics[J]. Journal of the American Society for Information Science, 1984, 35(6): 351-359.
59 Zitt M, Bassecoulard E. Reassessment of co-citation methods for science indicators: effect of methods improving recall rates[J]. Scientometrics, 1996, 37(2): 223-244.
60 Schneider J W, Borlund P. Introduction to bibliometrics for construction and maintenance of thesauri: methodical considerations[J]. Journal of Documentation, 2004, 60(5): 524-549.
61 Cobo M J, López-Herrera A G, Herrera-Viedma E, et al. SciMAT: a new science mapping analysis software tool[J]. Journal of the American Society for Information Science and Technology, 2012, 63(8): 1609-1630.