Automatic Extraction of Grant Elements for Fine-Grained Matching with Outcomes
Lu Wei(1,2), Yi Fan(1,2), Huang Yong(1,2), Jiang Yi(1,2), Liu Yinpeng(1,2), Cheng Qikai(1,2)
1. School of Information Management, Wuhan University, Wuhan 430072
2. Institute of Intelligence and Innovation Governance, Wuhan University, Wuhan 430072
Abstract: Scientific funding plays a vital role in advancing scientific progress. In current fund management practice, however, a mismatch persists between grant outcomes and grant content, which calls for a fine-grained evaluation mechanism to ensure funding effectiveness. A prerequisite for fine-grained matching between grant outcomes and content is the extraction of key grant elements. Existing research focuses primarily on sentence-level rhetorical classification, which lacks the required granularity, whereas problem and method extraction from scientific papers usually targets a single research problem and cannot accommodate grants that involve multiple sub-problems. To address these limitations, this study extracts key elements from research grant proposals. We define five core grant elements: background, questions, methods, objectives, and significance. Three strategies (zero-shot learning, one-shot learning, and fine-tuning) are applied with large language models to extract these elements. Fine-tuning yields the best performance, with a ROUGE-L score of 0.849, demonstrating the effectiveness and practical applicability of the fine-tuned model for grant element extraction. This work lays a methodological foundation for downstream tasks and provides practical tools for managing and evaluating scientific research projects and their outcomes.
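The abstract reports ROUGE-L without stating how it was computed. As a reader aid, the following is a minimal sketch of the standard ROUGE-L F-score, which is based on the longest common subsequence (LCS) between a reference and a candidate. Whitespace tokenization, the default beta = 1, and the function names `lcs_length` and `rouge_l` are assumptions made for illustration; the tokenization and weighting used in the study are not given in this text and may differ.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    # Dynamic programme kept to one row at a time to save memory.
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(reference, candidate, beta=1.0):
    """ROUGE-L F-score on whitespace tokens (assumed tokenization)."""
    ref, cand = reference.split(), candidate.split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    recall = lcs / len(ref)       # share of the reference that is covered
    precision = lcs / len(cand)   # share of the candidate that is matched
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)
```

For an extracted element compared against a gold annotation, a score near 1 means the extraction preserves almost all of the annotated text in order, while a score near 0 means little ordered overlap.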
Received: 26 February 2025