Construction and Evaluation of Intelligent Medical Q&A System for Chronic Diseases: A Hybrid Modeling Approach Based on GraphRAG and MoE
Ma Xin1,2,4, Wang Fang1,2,3, Zhang Feng4, Li Zhaochuan4
1. Department of Information Resources Management, School of Information and Communication, Nankai University, Tianjin 300071
2. Center for Network Society Governance, Nankai University, Tianjin 300071
3. Nankai University Library, Tianjin 300071
4. Inspur Software Technology Co., Ltd., Jinan 250000
Abstract With chronic diseases becoming a core challenge in global public health, patients increasingly need high-quality, personalized, and sustainable health information services. Although large language models (LLMs) have recently shown significant advantages in medical Q&A tasks, existing systems still face bottlenecks in chronic disease management, a long-term, contextualized, and semantically complex scenario: coarse-grained knowledge scheduling, weak semantic awareness, and inconsistent model responses. To address this, the study follows a three-stage process of "structured knowledge modeling, hybrid-driven retrieval, and large-model cooperative response" and proposes a chronic disease Q&A system that fuses graph retrieval-augmented generation (GraphRAG) with a mixture of experts (MoE). First, a proprietary knowledge graph was constructed from multi-source heterogeneous data, covering disease evolution, lifestyle intervention, and long-term management. Second, a three-channel hybrid graph retrieval mechanism was designed that combines keyword matching with semantic vector recall, multi-model collaborative answering, and multi-subgraph prompt fusion, enabling dynamic knowledge scheduling for complex information needs. Third, after an initial screening of user queries for legitimacy and normativity, the user's implicit intent and the fused subgraphs were passed to an MoE dynamic gating network that coordinates nine open-source large models at the instruction level, enhancing both semantic depth and generation precision. Finally, common-sense and content-integrity checks, together with formatting rules, were applied to refine the preliminary generated text and ensure the accuracy, completeness, and readability of the output. The CdMedQA test set built for the experiment covers 118 chronic diseases and three types of management tasks. System performance was validated through a combination of objective metric evaluation and subjective satisfaction comparisons. The results show that the proposed system significantly outperformed multiple general-purpose and healthcare-specific large model baselines in accuracy, clarity, personalization, and contextual adaptability. The study offers a new path for the intelligent management of chronic diseases, as well as theoretical support and technical solutions for optimizing human-computer interaction and improving the credibility of generated content driven by multi-source knowledge.
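
The MoE dynamic gating step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the gating network is built, so the code below assumes a generic top-k softmax gate rather than the authors' design: the logits, the choice of `k = 3`, and the function names `softmax` and `top_k_gate` are all hypothetical, and only the count of nine experts comes from the text.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of gate logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_gate(logits, k):
    """Keep the k highest-scoring experts and renormalize their weights.

    Returns a dict mapping expert index -> routing weight (weights sum to 1).
    This is a generic sparse-gating scheme, assumed for illustration only.
    """
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen = ranked[:k]
    kept_mass = sum(probs[i] for i in chosen)
    return {i: probs[i] / kept_mass for i in chosen}


# Hypothetical gate logits for nine expert models on one user query.
gate_logits = [0.2, 1.5, -0.3, 0.9, 2.1, 0.0, -1.2, 0.4, 1.1]
weights = top_k_gate(gate_logits, k=3)

for expert, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"expert {expert}: weight {weight:.3f}")
```

In a setup like this, only the selected experts would be queried, and their answers would be combined or ranked according to the routing weights. How the described system merges the outputs of its nine models is not stated in the abstract.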
Received: 01 June 2025