Large Language Model Driven Academic Text Mining: Parameter-Efficient Fine-Tuning Strategy from the Tuning End
Liu Yinpeng 1,2, Lu Wei 1,2, Shi Xiang 1,2, Liu Jiawei 1,2, Cheng Qikai 1,2, Huang Yong 1,2
1. School of Information Management, Wuhan University, Wuhan 430072
2. Institute of Intelligence and Innovation Governance, Wuhan University, Wuhan 430072
Liu Yinpeng, Lu Wei, Shi Xiang, Liu Jiawei, Cheng Qikai, Huang Yong. Large Language Model Driven Academic Text Mining: Parameter-Efficient Fine-Tuning Strategy from the Tuning End[J]. Journal of the China Society for Scientific and Technical Information (情报学报), 2025, 44(9): 1159-1172.